Open@UCL Blog

Sensitive data – where and how to archive

Kirsty, 16 August 2022

Guest post by James Houghton, Research Data Support Officer

It is always essential to protect the personal identity of participants or information that could jeopardise the safety of a building, an endangered species, or similar. Deleting data at the end of a project is often necessary to guarantee privacy and security. But this data is sometimes of immense value. The potential usefulness of the data should be weighed against the likelihood of an accidental release and the risk of harm if an unintentional release did occur.

There are options for archiving data with access controls for researchers who feel strongly that their data should be preserved. Some repositories have built-in access controls that ensure sensitive data can only be accessed by specific persons who have undergone an application process. Only a few data repositories offer this feature, and those that do still have a remit that limits what data they can accept. Here are some examples.

  • ReShare (UK Data Service) – This site is a social research data repository created to share data from Economic and Social Research Council (ESRC)-funded projects, but it is open for submissions from other sources.
  • The European Genome-phenome Archive (EGA) – The EGA offers a service for permanent archiving and sharing of personally identifiable clinical data generated for biomedical research projects or in the context of research-focused healthcare systems.

The UK Data Service, which runs the ReShare archive, provides guidance on data access control and explains how to implement it.

If you are concerned about storing the data live, even with access controls, consider storing the raw data offline. The existence of the data can be advertised online by creating an entry in a repository that announces the data’s presence and explains how to access it. The repository record will also be assigned a DOI so that the dataset can be cited properly. Making sure the offline information is stored securely can be challenging, however. There needs to be a specific process to ensure the data is secure and accessible on request.

Dealing with the long-term archiving of sensitive data is complicated. The UCL Research Data Management Team can assist with this. Get in touch if you need support!

Save the Date: UCL Open Science Conference 2022

Kirsty, 23 February 2022

We are pleased to announce that the UCL Open Science Conference 2022 will be taking place on the 6th and 7th April 2022. As last year, the doors will be open to all and we are looking forward to seeing you!

The programme design is in its final stages but across the two days we will be presenting a combination of online and in person sessions across a variety of themes:

Wednesday 6th April

Morning session (10am – 12.30pm): Online

  • What does Open Science mean to me? – Panel discussion
  • Kickstart your research with technology and Open Software – Series of talks to introduce technical tools for everyone!

Afternoon session (1.30 – 4pm): In Person – UCL campus

  • How does Citizen Science change us?

Thursday 7th April

Morning session (10am – 12.30pm): Online

  • UKRI Town Hall – Discussion hosted by David Price (UCL VP Research) and featuring Sir Duncan Wingham and Rachel Bruce
  • Open in the Global South – Series of talks on the theme, featuring Sally Rumsey and Ernesto Priego

Registration will be opening soon, but please save the date and watch this space!

What can I do with my data at the end of a project?

Kirsty, 18 February 2022

The important thing to remember about data as you reach the end of a project is that sharing your data openly can have some brilliant outcomes, both for you as a researcher and for the development of your research area, but sometimes it isn’t possible to be truly open. This doesn’t mean that there is nothing you can do.

The core principles you should consider when planning and managing your data are those of FAIR. These principles apply whether your data is being made open or not – FAIR and Open are not mutually exclusive.

  • Your outputs should be Findable – this means they should be discoverable by the wider academic community and the public. If your outputs can be made open, this is a case of choosing an appropriate repository, but this still applies if your data cannot be shared. For example, if your data is commercially sensitive or cannot be fully anonymised, it should still be made known that the data exists so that interested parties can still find out about the project. They may approach you for further details or to collaborate.
  • They should be Accessible – you should make sure that you, or the system you choose, use unique identifiers, high-quality metadata, clear language and clear access protocols. This goes for any data you might want to share, which could be the whole dataset, a derived subset of the data, the data that underlies a specific publication or even just a record of the project as discussed above.
  • They should be Interoperable – this means that any data structures or file types you use need to be able to be opened and used by others. This is an important consideration, as your data, code, annotations or any other file needs to be reusable into the future when the original programme you used may not be around anymore.
  • Finally, they should be Reusable – enabling the repurposing of research outputs to maximise their potential. This means that file types need to be operational long after publication as discussed above, but also that files need to be annotated in such a way that someone else can use them accurately to either reproduce or build upon your research. If data cannot be shared, you can still take action. If you have used commercial or clinical datasets, or personally identifiable data that needs to remain restricted, you can share how you processed the data to reach your published conclusions, or the code you used. This will enable researchers who access or collect the same data from different participants to reproduce your research, without access to the same exact data.

When combined, these four elements help lower barriers to research outputs and facilitate secondary researchers finding, understanding, reusing and repurposing your research to realise additional research opportunities and maximise existing resources, even if you can’t share the data in full.
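One way to make these four elements concrete is a simple self-check run against a description of an output. The Python sketch below is only an illustration: the `OutputRecord` class, its field names and the `fair_gaps` function are hypothetical names invented for this example, and they are not part of any UCL system or of a formal FAIR assessment tool.

```python
from dataclasses import dataclass


@dataclass
class OutputRecord:
    """Hypothetical description of a research output (all field names are assumptions)."""
    data_is_open: bool          # recorded for information only; FAIR does not require it
    has_public_record: bool     # Findable: a discoverable entry saying the data exists
    has_identifier: bool        # Accessible: a unique identifier such as a DOI
    has_access_statement: bool  # Accessible: how (or whether) the data can be requested
    uses_open_formats: bool     # Interoperable: file types others can open in future
    has_documentation: bool     # Reusable: annotation, methods or code alongside the data


def fair_gaps(record: OutputRecord) -> list[str]:
    """Return the checks that the record does not yet satisfy."""
    checks = {
        "Findable: public record": record.has_public_record,
        "Accessible: identifier": record.has_identifier,
        "Accessible: access statement": record.has_access_statement,
        "Interoperable: open formats": record.uses_open_formats,
        "Reusable: documentation": record.has_documentation,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


# A restricted dataset can still pass every check.
restricted = OutputRecord(
    data_is_open=False,
    has_public_record=True,
    has_identifier=True,
    has_access_statement=True,
    uses_open_formats=True,
    has_documentation=True,
)
print(fair_gaps(restricted))
```

The `data_is_open` flag is deliberately left out of the checks, reflecting the point above that the principles apply whether or not the data itself can be shared.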

Whatever happens to your data at the end of a project should have been in the plan from the beginning. You should be aware from the start how much of your data can be shared, and you will have thought about where and how you were going to share it. Of course, if you aren’t sure, there are teams that can help make these decisions, and a wide range of advice available, even about less common topics like finding data to form part of your project, how to negotiate a Material Transfer Agreement in order to use it, how to securely destroy sensitive data or even how to cite data appropriately, just to name a few!

With this post we bring our Love Data Week full circle, back to the teams that can support you, and the importance of a good plan. Thank you for joining our activities this week; we hope you enjoyed them.

Data Sharing Highlights

Kirsty, 17 February 2022

At UCL there is a recognition that there is more to publishing research than simply books and papers in print and PDF format.

One of the ways this is supported is through the UCL Research Data Repository. Our institutional publishing platform can be accessed here: https://rdr.ucl.ac.uk/

The RDR can be used to publish all kinds of data, in a raw or visual format. All items are given a DOI and can be referenced to the same standard as a journal article or book. Many published items are supporting data for other publications, but a large number are also standalone items.

Visual Archaeological Data

A great example of using the RDR to present and share visual information is this archaeological data record: https://doi.org/10.5522/04/11385852.v1

This record is part of a project to document and catalogue archaeological sites across Central Asia. The record includes photographs of the site, height map data and 3D models, along with some descriptive information about the site itself and co-ordinates which specify the location. The 3D models are also hosted live on the Sketchfab website, but including them in this data record provides an extra level of preservation. While the Sketchfab website might become defunct, this data record becomes part of the UCL permanent collection and will be retained. The files could be used to reconstruct the 3D models if needed.

Sharing Research Methods

Sharing data is important, but sharing research methodology can be a really powerful way to improve reproducibility and transparency. The UCL RDR can also be used to share methods and designs used in research. It’s worth highlighting that although the RDR has “data” in the name it is very flexible in terms of what can be shared. Questionnaires, study designs, posters and presentations are all accepted.

As an example of sharing research methods, this Survey of Clinical Trial units published with the repository contains not only an anonymised set of survey responses but also the exact survey document used. This means there is a clear record of the exact questions used to generate the data, which could easily be used to recreate a similar survey at a later date, for example to see how trends have shifted over time.

Access this record here: https://doi.org/10.5522/04/7992998.v1

Beyond the RDR: Publishing data from the Natsal Surveys

The UCL RDR is just one platform for sharing data, and is not always the most appropriate for a particular project. A great example of presenting data in a way that makes it available to explore for a general audience is an online interactive that’s been developed using data from the last Natsal survey.

The British National Surveys of Sexual Attitudes and Lifestyles (or ‘Natsal’ for short) are a project led by UCL. So far three surveys have been completed, roughly one per decade, with the first in 1990. Since then over 45,000 people have been interviewed, each one randomly selected from across the country so that the data can be considered broadly representative of the British population. First carried out in response to the HIV and AIDS epidemic, the Natsal surveys have become the leading source of reliable information about sex in Britain. Each of the surveys provides a snapshot of the nation’s sexual behaviour, experiences, and attitudes, and together they paint a comprehensive picture of how the nation’s sex lives change over time. Interviews for the next and fourth Natsal are due to start this summer.

The Natsal research team have been sharing their data through the UK Data Service for many years, providing a useful resource for other researchers. Data released from the last survey goes one step further to make sure the data are accessible beyond professional researchers and statisticians. Anyone will be able to explore the latest Natsal data using a freely available, visually engaging interactive explorer, developed as a collaboration between UCL and the Open University. It also includes a taster animation and an interactive activity, ‘No sex please, we’re British!’, which allows anyone to see how their own views correspond to the survey’s findings for the population as a whole.

Getting Data in to UCL

Kirsty, 16 February 2022

When working with an external collaborator from another university or an industrial partner, or when obtaining access to pre-existing data provided by a third party, there are data management responsibilities to be aware of. Transfer of data can involve legally binding agreements governing data storage and usage. Even if a legal agreement is not required, managing the transfer and sharing of data can be challenging. Here’s a short guide to services and support at UCL.

Where to put your data?

There are several systems for storing data at UCL, depending on your needs. Here is a quick guide to determine the most suitable choice when transferring data in from external sources.

  • Dropbox (www.ucl.ac.uk/dropbox/) – Not to be confused with the commercial service of the same name! This service is simple and convenient if you want to transfer a small number of files to or from a collaborator outside UCL as a one-off and have no specific security requirements.
  • SharePoint and OneDrive for Business – A better option if you need to transfer files between yourself and a collaborator more frequently. Once you are set up with this system it can be configured to allow an external user access.
  • Research Data Storage Service – For large-scale data storage, with support for rapid transfer of large files. Once a project is set up it can be configured to allow access for external users. This is an excellent system to use for the transfer of very large data volumes.
  • Data Safe Haven – This secure data storage service is suitable for personally identifiable data, and could be used in other instances when a higher level of security is desired. It conforms to the ISO 27001 standard for information security management and the NHS DSP requirements. A level of training and verification is required to get set up on the platform, so if your project requires this it’s best to begin the process as soon as possible.
  • The UCL Jill Dando Institute runs a specialised laboratory for processing highly sensitive data such as confidential crime data. This is probably beyond the needs of the majority of UCL projects. These facilities are suitable for those that require a Police Assured Secure Facility or need to work with UK government data marked as OFFICIAL-SENSITIVE and OFFICIAL. As with the Data Safe Haven, there is a vetting and training process to get access.
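The guide above can be read as a rough decision procedure. The Python sketch below is only an illustration of that reading: the function name, the four yes/no flags and the order in which they are checked are assumptions made for this example, not an official UCL tool or policy, and real cases should be discussed with the teams listed at the end of this post.

```python
def suggest_storage(police_or_official_sensitive: bool = False,
                    personally_identifiable: bool = False,
                    very_large_volume: bool = False,
                    frequent_transfers: bool = False) -> str:
    """Suggest a UCL service for incoming data, following the guide above.

    The flags are illustrative assumptions. They are checked from the most
    restrictive requirement to the least, so security needs take priority
    over convenience.
    """
    if police_or_official_sensitive:
        return "Jill Dando Institute secure laboratory"
    if personally_identifiable:
        return "Data Safe Haven"
    if very_large_volume:
        return "Research Data Storage Service"
    if frequent_transfers:
        return "SharePoint and OneDrive for Business"
    # A one-off transfer of a few files with no specific security requirements
    return "UCL Dropbox"


print(suggest_storage(personally_identifiable=True, very_large_volume=True))
```

Checking the security-related flags first means that a large dataset containing personally identifiable data is still directed to the Data Safe Haven rather than to general-purpose storage.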

Using surveys to collect data

Recently there has been a move toward doing survey work largely online to support social distancing. Using online survey tools can be an efficient way to gather responses from participants and may even allow for an increase in the size and scope of a study, but there are data security concerns to be aware of.

  • Questionnaires can be constructed using a variety of online tools. For any research that doesn’t involve collection of identifiable personal information there is no special requirement here. In fact, the free Google Forms service might fit the needs of many projects.
  • UCL has access to some in-house tools that provide some extra functionality. If identifiable or otherwise sensitive data is being collected, the REDCap tool can be used to send this data directly into the Data Safe Haven.
  • Of course, some projects may still make use of an in-person paper survey form. In these cases, if the forms contain identifiable or sensitive information it is necessary to store the documents in a physically secure location. If the data needs to be converted into electronic format then it should be stored in the Data Safe Haven system.
  • For survey work involving in-depth interviews with participants, the situation is a little more difficult. The nature of these sorts of interviews increases the likelihood of sensitive information being given. When conducting in-person interviews consider using recording devices with built-in encryption. If recording a Zoom or Teams interview, make sure the recording is transferred as quickly as possible to safe storage and is not left in cloud storage for any length of time.

Data Transfer Agreements

You may require a kind of contract to get data from a third party, and there might be conditions attached to the permission to use this data. These kinds of contract are called Data Transfer Agreements (DTAs). A DTA might need to be in place, and is very likely to be part of the process, when working with a commercial partner. Academic collaborators should also consider having a DTA drawn up, to ensure clarity about how the data is being processed and to prevent any disputes that could otherwise arise.

A data transfer agreement (DTA) is a specific version of a materials transfer agreement, which is a type of contract used when physical materials with scientific or commercial value are transferred between UCL and a third party. Both of these are handled by UCL Research and Innovation Services, who can provide more information and guidance if required.

Do also be aware that a data transfer agreement can be put in place by UCL staff and students when giving data to an external organisation too. They work both ways! If you ever need to share data with an external collaborator always consider if a DTA is necessary. It might be a useful tool to make sure your data is used for its intended purpose and is looked after properly.

This agreement is important as it outlines your responsibilities and permissions with respect to the data and is a legally binding agreement. There may be restrictions on what it can be used for, who can access it, and what security will be in place to protect the data. The agreement will specify details such as whether the data can or cannot be shared with other staff at UCL, with external partners, and what kind of security arrangements are required for storage.

Data Sources with specific application processes

A great deal of data is collected by government departments, healthcare providers and other services which can potentially be very valuable for researchers. UK law permits personal data collected for purposes such as health, policing and education to be re-used for research if this is deemed to be in the public good. While specific agreements with other universities or commercial partners can be covered with DTAs, these data providers have well-defined processes for granting access as they deal with relatively large volumes of requests. Depending on the type of data you may have to meet security requirements. The UCL Data Safe Haven often helps with this. Here are some examples of data access services:

Need further guidance?

This is a brief overview of a big topic, so if you want further guidance on any of these points several teams across UCL can assist.

  • Information Governance assist with Data Safe Haven related questions: infogov@ucl.ac.uk
  • The Data Protection team also assist with questions around handling personal data: data-protection@ucl.ac.uk
  • For more general enquiries the library research data management teams are a good first contact point and can direct you to the appropriate person: lib-researchsupport@ucl.ac.uk

RDM Highlights from the past 12 months

Kirsty, 15 February 2022

To mark Love Data Week here at UCL, the RDM team have published the now annual RDM review: https://doi.org/10.5522/04/19070309

Research Data Management (RDM) covers the decisions made – and actions taken – to manage research outputs across the research data lifecycle, complementing other components of open science and scholarship such as: Open Access, Bibliometrics, Research Integrity and Citizen Science.

As advocates of best practice in research data management, the RDM team have continued to promote UCL’s definition of ‘research data’ which emphasises that data refers to any output of research relevant to research findings and that publishing software, models, protocols, figures etc. is equally as important as publishing data.

In 2021, we reviewed a record-breaking number of data management and sharing plans; plus, we are publishing more research outputs than ever using the UCL Research Data Repository which recorded over 97000 downloads. We also continued to provide training via Moodle with almost 300 people completing writing a data management plan to date. 2021 also saw us become known as Library, Culture, Collection & Open Science and we as a team joined the Centre for Advanced Research Computing as associate staff.

Looking forwards to 2022, we plan on working even more closely with colleagues across UCL to enhance data management services throughout the research data lifecycle. We will also continue to advocate for changes to policy and practice relating to the way UCL rewards and assigns credit to all contributors to research to better acknowledge the vital contributions made by those especially in non-academic roles.

The full report can be found here: McMahon, Christiana; Houghton, James; Wallis, Kirsty (2022): Research Data Management: The 2021 Review. University College London. Presentation. https://doi.org/10.5522/04/19070309

Here’s to 2022!

Research Data at UCL – meet the teams!

Kirsty, 14 February 2022

Welcome to Love Data Week!

You have probably heard of the work of the Research Data Management Team, who support you with decisions made during the research lifecycle to handle the data you work with, use or generate, from the planning stage of your project up to the long-term preservation of your data. Good data management practices are essential to meet UCL standards of research integrity.

To summarise, planning Research Data Management effectively helps you to ensure data quality, minimise risks, save time and comply with legal, ethical, institutional and funders’ requirements. The RDM team can guide you in the creation of your Data Management Plan, read and assess your plans when complete as well as advise you throughout the research process.

Contact: lib-researchsupport@ucl.ac.uk.

Outside the RDM team, there are a range of teams across the university that can support you.

The Data Protection and Freedom of Information Team are responsible for providing advice to UCL on data protection issues and handling statutory data protection and freedom of information requests. UCL’s Data Protection Officer leads the team, which sits in the Office of the General Counsel, and we work very closely with the Legal Services team and the Information Services Group.

All research proposals that involve personal data must be registered with the data protection office before processing begins. Further information, including FAQs and guidance notes, is also available.

Contact: data-protection@ucl.ac.uk.

The UCL Research Ethics Committee (REC) and Research Ethics Officers perform an important function in the assessment of all applications submitted for ethical review and approval. The research ethics team must ensure that all applications have rigorously considered any ethical implications arising from the proposed research design, methodology, conduct, dissemination, future use, and data sharing and linkage. How these will be managed should be carefully explained within a research plan. Ethical review of data management and security is a fundamental component of the ethical review procedure, and researchers must demonstrate a strategy for data storage, handling of sensitive data, data retention and sharing. The following points are frequently requested:

  1. What type of data will you collect and how will you describe them?
  2. How will you store and keep your data secure?
  3. Will you be allowed to give access to your data once the project is completed? Who will be able to access them, under what conditions and for how long?

More information about research ethics, data protection for researchers and data management tools for handling sensitive and personal/special category data is available online.

Contact: ethics@ucl.ac.uk

The Research Contracts Team sits within Research and Innovation Services. Research Contracts assists Academics by putting in place appropriate agreements for sponsored research on behalf of UCL. This includes reviewing, drafting and providing advice to Academics and Departments and negotiating acceptable terms. This includes material transfer agreements of which data, including personal and pseudo-anonymised data are part. There are current exceptions to our remit which include: Clinical Trial Agreements and EU grants & contracts, procurement agreements and consultancy.

For queries please contact:

Visit their website for more information.

The Information Governance & Compliance team are a part of the UCL Information Security Team, with a focus on research compliance. They provide support for compliance aspects of data applications, such as DARS and CAG for NHS data, and applications to the Department for Education. They also have experience with a wide range of requirement sets, ranging from commercial organisations through to public services. For data that falls outside of formal external agreements, for example data collected directly, they can offer suitable information governance advice that aligns with the Information Commissioner’s Office Accountability Framework. Most requirements can be met by using the UCL Data Safe Haven (DSH).

For research that cannot use the DSH, we can help determine suitable technical and organisational measures. We also manage access to the ONS Secure Research Service (see the process section on the left hand side of linked page).

Contact: infogov@ucl.ac.uk

The UCL/UCLH Joint Research Office (JRO) provides research management and governance support for clinical research studies that take place across University College London and/or UCL Hospitals NHS Foundation Trust (UCLH). Support and guidance are provided to researchers wishing to conduct clinical research which recruits NHS patients and/or uses their tissue or their data.

This includes any clinical research that requires a formal ‘Sponsor’ as defined by the UK Policy Framework for Health and Social Care Research (2017), the Medicines for Human Use (Clinical Trials) Regulations 2004 and subsequent amendments and the Medical Devices Regulations. Sponsor authorisation for these studies is provided by the JRO or one of the UCL clinical trials units (CTU).

The JRO consists of specialist teams who interface with colleagues across UCL/UCLH to support researchers through the research process. This includes guiding researchers through the approvals processes (e.g. NHS REC/MHRA), research contracting, research finance, regulations and compliance, study set-up and conduct, and data management.

More information about the JRO and how to get in touch can be found on the website.

And finally, the Research Integrity Team oversees and supports a broad set of research integrity initiatives at UCL to ensure compliance with the Concordat to Support Research Integrity and to support UCL to ‘Pursue a responsible research agenda’ (UCL 2019 Research Strategy – Cross-cutting Theme A). This includes coordinating periodic audits of UCL’s adherence to research integrity standards, leading on policy matters relating to research integrity, and frameworks for supporting integrity in research, such as the Statement on Research Integrity, the Framework for Research Integrity and the Code of Conduct for Research. The team also leads on the development of training for staff and students and provides advice and advocacy across UCL.

Who knew there were so many wonderful places to get support, and information to support your research data journey! If you are reading this during Love Data Week, don’t forget that we are hosting a Research Data Clinic with members of these teams to answer your questions! Thursday 17th February 2022 at 10.30am – register your interest on the form and we will send you all the information. After Love Data Week, get in touch with the teams directly, comment below or get in touch on Twitter!

Next week is Love Data Week!

Kirsty, 9 February 2022

Love Data Week is now upon us! We have prepared the wine, bought the chocolates, and lined up a whole host of delicious blog posts for you to enjoy!

The date for the Research Data drop-in clinic is set for Thursday 17th February at 10.30am and representatives from across the university will be ready and waiting to answer all of your questions. There is still time for you to get your link to join us by adding your email address to our contact list using this form.

And finally, throughout the week we will be sharing highlights from our suite of training videos on social media. We have a sneak preview of one of them lined up for you below! Follow us on Twitter or subscribe to the blog in the menu on the right to keep up to date!

Access the full video on MediaCentral.

Coming soon – Love Data Week!

Kirsty, 26 January 2022

Forget about finding a restaurant for Valentine’s Day; join us instead in the week of the 14th of February and love your data!

Starting on Monday 14th February we are bringing together colleagues that support research data from across the university just for you. We will be talking about all of the different forms that data can take, featuring profiles on different teams available to support you, sources of and tools for your data, and information about how data flows through the research process.

Every day we will be launching a new blog post and sharing videos, and on Thursday 17th February at 10.30am we will be hosting a drop-in clinic for all your research data questions!

If you are interested in the drop-in clinic, please let us know using this form; this will ensure that you get the link to join us! We will be trying out a new platform called Wonder for the event that allows lots of conversations to take place at once, and for you to identify the expert that you need.

New year, new library research skills

Kirsty, 12 January 2022

Did you intend to make new year resolutions but did not get round to it? Why not resolve to take some time this year to further develop your library research skills and ensure you are following best practices for research? UCL Library Services provides training and support to enable you to carry out your research effectively, including online guidance and self-paced tutorials, live online training sessions, tailored and individual training and specialist enquiry services.

Here are our top 5 suggested resolutions for researchers looking to enhance their library research skills and research practices:

Be FAIR

The principles of FAIR are designed to help lower barriers to research outputs and help other researchers find and understand them in order to reuse and repurpose them. This will in turn build further research opportunities and maximise the potential benefit of resources.

Findable – making research outputs discoverable by the wider academic community and the public.
Accessible – using unique identifiers, clear metadata, use of language and access protocols.
Interoperable – applying standards to encode and exchange data and metadata.
Reusable – enabling the repurposing of research outputs to maximise their research potential.

Practise open publishing

The goal of Open Access is to make all research material openly available online without restriction, to all readers, free from the barriers imposed by subscriptions. Open access is now required by many research funders and for the REF but it also has its own intrinsic benefits such as more exposure for your work, more citations, broader reach and wider readership worldwide.

Get searching

Refine your literature searching skills for reliable, relevant and comprehensive results. Whether you are searching for references to inform your research, as background reading, to scope your research topic, for a literature review or a systematic review, a robust search strategy is essential to ensure you find all the relevant research without having to wade through excessive irrelevant results. Our support for literature searching includes a range of options to support you at every stage of your research:

Organise your references

Get the most out of reference management software such as EndNote, Mendeley or Zotero, which enable you to gather and organise references and full text documents relevant to your research and to insert references in a Word document automatically, generating a reference list in the citation style of your choice. We provide support in using EndNote, Mendeley and Zotero to help you use the software more effectively and to troubleshoot your queries:

Understand bibliometrics

Bibliometrics is concerned with the analysis of research based on citation counts and patterns. The individual measures used are also commonly referred to as bibliometrics, or citation metrics.