Open@UCL Blog

UCL Research Data Repository: Celebrating over 1 million views!

By Naomi, on 10 June 2025

Guest post by Dr Christiana McMahon, Research Data Support Officer

Since launching in June 2019, the UCL Research Data Repository has received over 1 million views from over 190 countries and territories across the world! Plus, we have published over 1,000 items and facilitated over 800,000 downloads!

This is a huge milestone and demonstrates how far reaching the Research Data Repository has become.


To date, the:

  • most viewed record is:

Heenan, Thomas; Jnawali, Anmol; Kok, Matt; Tranter, Thomas; Tan, Chun; Dimitrijevic, Alexander; et al. (2020). Lithium-ion Battery INR18650 MJ1 Data: 400 Electrochemical Cycles (EIL-015). University College London. Dataset. https://doi.org/10.5522/04/12159462.v1

  • most downloaded record is:

Steinmetz, Nicholas A; Zatka-Haas, Peter; Carandini, Matteo; Harris, Kenneth (2019). Distributed coding of choice, action, and engagement across the mouse brain. University College London. Dataset. https://doi.org/10.5522/04/9970907.v1

  • most cited record is:

Pérez-García, Fernando; Rodionov, Roman; Alim-Marvasti, Ali; Sparks, Rachel; Duncan, John; Ourselin, Sebastien (2020). EPISURG: a dataset of postoperative magnetic resonance images (MRI) for quantitative analysis of resection neurosurgery for refractory epilepsy. University College London. Dataset. https://doi.org/10.5522/04/9996158.v1

What is the UCL Research Data Repository?

From the Research Publications Service for published manuscripts and theses, to MediaCentral for all things media, UCL staff and students can access different places to store their research outputs – and the UCL Research Data Repository is a perfect place for research data, posters, presentations, software, workflows, data management plans, figures and models.

Key features:

  • Available to all current staff and research students
  • Supports almost all file types
  • All published items can have a full data citation including a DOI (unique persistent identifier)
  • Items can be embargoed where necessary
  • Helps provide access and data sharing
  • Preserves and curates outputs for 10+ years
  • Facilitates discovery of research outputs
  • Helps researchers to meet UCL / funders’ requirements for FAIR data
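As a rough illustration of what a full data citation including a DOI looks like, the sketch below assembles one from its parts, following the pattern of the citations quoted earlier in this post. The function name and its parameters are hypothetical and are not part of the repository's own software; this is just one way to lay the pieces out.

```python
def format_data_citation(authors, year, title, publisher, item_type, doi):
    """Assemble a citation as: Authors (Year). Title. Publisher. Type. DOI link.

    Hypothetical helper for illustration only.
    """
    author_text = "; ".join(authors)
    return (
        f"{author_text} ({year}). {title}. {publisher}. "
        f"{item_type}. https://doi.org/{doi}"
    )


# Details taken from the most downloaded record quoted above.
citation = format_data_citation(
    authors=[
        "Steinmetz, Nicholas A",
        "Zatka-Haas, Peter",
        "Carandini, Matteo",
        "Harris, Kenneth",
    ],
    year=2019,
    title="Distributed coding of choice, action, and engagement across the mouse brain",
    publisher="University College London",
    item_type="Dataset",
    doi="10.5522/04/9970907.v1",
)
print(citation)
```

Because the DOI is a persistent identifier, the link at the end of the citation should keep resolving to the item even if the repository's web addresses change.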

More information about the service can be found on our website.

Access our user guide.

Why use the Research Data Repository?

With communities across UCL being actively encouraged to engage with the FAIR principles, it was important to give staff and research students even greater means to do so. The FAIR principles (Findable, Accessible, Interoperable and Reusable) are a set of attributes that research outputs should have so that secondary researchers can find, understand, repurpose and reuse them without major technical barriers. There are many advantages to having FAIR research outputs, including:

  • Greater accessibility of research outputs
  • Enhanced transparency of the research process
  • Greater potential to replicate studies and verify findings
  • Enhanced potential for greater citation and collaboration
  • Encourages members of the public to become involved in research projects and become citizen scientists
  • Maximises research potential of existing research resources by reusing and repurposing them

Hence, we developed and launched the Research Data Repository to support staff and research students wanting to further engage with the FAIR principles here at UCL.

Collaboration is key

The Research Data Management team in Library Services and the Research Data Stewardship team from the Centre for Advanced Research Computing collaborate to provide both administrative and technical support, helping users to upload, publish and archive their research outputs.

You can reach us using researchdatarepository@ucl.ac.uk or join us at one of our online or in-person drop-in sessions.

What does the future hold?

Over the past year, the Research Data Repository team participated in a series of workshops as part of the FAIR-IMPACT Coordination and Support Action funded by the European Union. This work was led by Dr Socrates Varakliotis and supported by Dr Christiana McMahon, Kirsty Wallis, Dr James Wilson and Daniel Delargy.

The aims of these workshops were to:

  • firstly, enhance the trustworthiness of the repository; and
  • secondly, enhance the semantic metadata (documentation) made publicly available online

During the first project, we conducted a thorough self-assessment of the information we provide about the repository service with a view to highlighting how we demonstrate trustworthiness. Consequently, we made a series of improvements to our documentation including the publishing of a new, more accessible website.

Over the course of the second project, we focused on improving the standardised metadata we make available to search engines indexing repository information globally. In this project, we were able to demonstrate how having validated metadata is important to supporting the trustworthiness of repository services.
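To give a flavour of what standardised metadata for search engines can look like, here is a minimal sketch that builds a machine-readable description of a dataset and checks that key fields are present. It assumes the schema.org Dataset vocabulary expressed as JSON-LD, which is a common choice but is our assumption for this example rather than a statement of the repository's exact format. The helper names and the list of required fields are hypothetical.

```python
import json

# Hypothetical minimum set of fields for this illustration only.
REQUIRED_FIELDS = ("name", "identifier", "creator", "publisher", "datePublished")


def build_dataset_jsonld(title, doi, authors, publisher, year):
    """Describe a dataset using schema.org terms (assumed vocabulary)."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "identifier": f"https://doi.org/{doi}",
        "creator": [{"@type": "Person", "name": name} for name in authors],
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": str(year),
    }


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Details taken from the most cited record quoted earlier in this post.
record = build_dataset_jsonld(
    title=(
        "EPISURG: a dataset of postoperative magnetic resonance images (MRI) "
        "for quantitative analysis of resection neurosurgery for refractory epilepsy"
    ),
    doi="10.5522/04/9996158.v1",
    authors=["Pérez-García, Fernando", "Rodionov, Roman", "Alim-Marvasti, Ali"],
    publisher="University College London",
    year=2020,
)

print(json.dumps(record, indent=2, ensure_ascii=False))
print("Missing fields:", missing_fields(record))
```

A simple completeness check like `missing_fields` is only a stand-in for real validation, which would normally be done against the published vocabulary with a dedicated validator.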

The next step is to explore how the repository’s trustworthiness may be enhanced even further to formally meet international standards and expectations.

Final thoughts

Having over 1 million views truly is a fantastic achievement and testament to the hard work and dedication of those working behind the scenes to provide this brilliant service, and the wonderful users across UCL who have published with us.

Next stop, 2 million views – and until then…

Get involved!

The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on Bluesky, and join our mailing list to be part of the conversation!

Archiving beyond data at UCL

By Kirsty, on 14 February 2025

Article written by Dr Christiana McMahon, UCL Research Data Management Team

As we discussed in the blog post on Wednesday, there are ways to share and be transparent about your research without compromising ethical or legal requirements. Did you know that the UCL Research Data Repository (RDR) is a great way for staff and research students to archive, preserve, promote and publish these outputs created as part of the research process?

Have you ever been asked to archive other types of research outputs? UCL has RPS for articles, but did you know that you can also archive and share other outputs of research? Be it a whole dataset, or other types of material supporting your research such as photos, models, software, presentation slides, your Data Management Plan, even a poster.

By using the RDR, you can get a permanent identifier for your work in the form of a DOI and add it to your ORCID. Additionally, as you publish items in the UCL Research Data Repository, these are automatically collated together on a data repository webpage just for you.

With over 580,000 downloads globally, we have published over 950 research outputs that have been viewed in over 180 countries, so start publishing today!

Need to write a data management plan?

By Kirsty, on 13 February 2025

Article written by Dr Christiana McMahon, UCL Research Data Management Team

Writing a data management plan can be a difficult task to approach at the best of times. There are lots of things to consider: as well as going through your project in detail, you may also need to take into account external funding agency requirements, UCL’s research data expectations and the FAIR data principles. Advice from the Research Data Support Officers might be just what you need!

There is still plenty of time to get registered for one of the upcoming courses on how to write a data management plan! Book your place online today.

Data management plans (DMPs) describe your data management and sharing activities across the research data lifecycle. A DMP is a valuable document to refer to throughout your research project, and it can help you structure and protect your data for the long term. A fully completed DMP is usually 1–3 pages in length and can even be published as an output of your research. We recommend that DMPs are written at the start of the research and are regularly reviewed and updated over the course of the project.

For more information on data management planning and how to get in touch with the team, visit our website.

Open research…yes; Open evidence…no?

By Kirsty, on 12 February 2025

Article written by Dr Christiana McMahon, UCL Research Data Management Team

“I want to share my data, but I can’t because…” is something we hear often.

Sometimes, it’s not possible to share evidence openly and that’s okay. Let’s take a closer look at what we can do to promote research findings and foster transparency and confidence in the research process.

At UCL, staff and students are actively encouraged to share their research outputs openly with the wider academic and public communities. However, openly sharing the research evidence which underpins published findings might not always be possible as there could be ethical, legal or commercial reasons prohibiting you from doing so. Hence the phrase, “as open as possible, as closed as necessary”.

While you may not be able to share your data as an output, there are many other considerations. Can you be transparent about your processes? Can you tell others how you did the research so that they can replicate your methods? Have you considered ways to anonymise or share derived subsets of your data? What about your publications associated with your research, can they be open?

There is a huge range of options available to you. Check out the Office for Open Science and Scholarship website for advice and support on engaging with open research principles even when the research evidence cannot be made publicly accessible.

Plus, easily access different teams across UCL helping you to engage with open research:

Whose data is it anyway? The importance of Information Governance in Research

By Kirsty, on 11 February 2025

Guest post by Preeti Matharu, Jack Hindley, Victor Olago, Angharad Green (ARC Research Data Stewards), in celebration of International Love Data Week 2025

Research data is a valuable yet vulnerable asset. Researchers collect and analyse large amounts of personal and sensitive data ranging from health records to survey responses, and this raises an important question – whose data is it anyway?

If data involve human subjects, then participants are the original owners of their personal data. They grant permission to researchers to collect and use their data through informed consent. Therefore, responsibility for managing and protecting their data, in line with legal, regulatory and ethical requirements and policies, lies with researchers and their institution, who must maintain a balance between participant rights and researcher needs.

Under the General Data Protection Regulation (GDPR) in the UK and EU, participants have the right to access, update and request deletion of their data, whilst researchers must comply with the law to ensure research integrity. However, under the Data Protection Act, research data processed in the public interest must be retained irrespective of participant rights, including the rights to erase, access and rectify. UCL must uphold this requirement while ensuring participant confidentiality is not compromised.

Information governance consists of policies, procedures and processes adopted by UCL to ensure research data is managed securely and complies with legal and operational requirements.

Support for information governance in research is now provided by Data Stewards within ARC RDM IG. That’s a long acronym, let’s break it down.

  • ARC: Advanced Research Computing – UCL’s research innovation centre, which provides (1) secure digital infrastructure and (2) teaching software.
  • RDM: Research Data Management – assists researchers with data management.
  • IG: Information governance – advises researchers on compliance for managing sensitive data.

Data Stewards – we support researchers with data management throughout the research study, provide guidance on data security awareness training, data security requirements for projects, and compliance with legal and regulatory standards, encompassing the Five Safes Framework principles. Additionally, we advise on sensitive data storage options, such as a Trusted Research Environment (TRE) or the Data Safe Haven (DSH).

Furthermore, we emphasise the importance of maintaining up-to-date and relevant documentation and provide guidance on FAIR (Findable, Accessible, Interoperable, Reusable) data principles.

As stated above, data can be vulnerable. UCL must implement strong security controls including encryption, access control and authentication, to protect sensitive data, such as personal health data and intellectual property. Sensitive data refers to data whose unauthorised disclosure could cause potential harm to participants or UCL.

UCL’s Information Security Management System (ISMS) is a systematic approach to managing sensitive research data to ensure confidentiality, integrity, and availability. It is a risk management process involving people, processes and IT systems. The key components include information management policy, identifying and assessing risks, implementing security controls to mitigate identified risks, training users and continuous monitoring. The ISMS is crucial in research:

  1. It protects sensitive data; without stringent security measures, data is at risk of being accessed by unauthorised individuals leading to potential theft.
  2. It ensures legal and regulatory compliance, for example with GDPR and UCL policies. Non-compliance can result in hefty fines, legal action and reputational damage.
  3. Research ethics demand that participant data is handled with confidentiality. The ISMS supports sound data management practices, data anonymisation and controlled access, whilst reinforcing ethical responsibility.
  4. It reduces the risk of phishing attacks and ransomware.
  5. It ensures data integrity and reliability – tampered or corrupted data can lead to invalid research and waste of resources.

UCL practices for Information Governance in research:

So, whose data is it anyway? Data may be generated by participants, but the overall responsibility to use, process, protect and ethically manage it lies with the researchers and UCL. Beyond compliance and good information governance, it is also about ensuring research integrity and safeguarding the participants who make research possible.

It’s International Love Data Week 2025!

By Kirsty, on 10 February 2025

In true UCL tradition, we kickstart the week with the annual Research Data Management review so take a look at our poster and see what we’ve been doing in the Library!

OOSS Review of the Year

By Kirsty, on 14 January 2025

Here in the Office for Open Science & Scholarship we like to start every new year by taking a look back over the last and sharing our highlights with you.

In 2024 the Open Access Team facilitated the Gold open access publication of 3,963 papers. UCL Discovery continued to go from strength to strength, with over 53 million downloads. The publications repository now boasts over 185,000 open access items, including 24,900 theses, with over 15,500 uploads in the preceding twelve months.

The Research Data Management Team has had an equally productive year, publishing over 230 items in the data repository which has now exceeded 230,000 views and a similar number of downloads. They have reviewed over 30 data management plans and held classes for over 70 people, both online and in person. There are still seats available for term 2 that can be booked online.

Updates and publications

Across all of the teams that make up the Office we have published a whole host of documents and updates such as:

Our blog highlights

The blog has been super busy throughout the year. One of our personal highlights was the brilliant series of posts by Christine Daoutis, UCL’s Copyright Officer, which looked at a range of issues around copyright in open science, including a deep dive across three posts into Copyright and AI, how copyright exceptions can support your research and how copyright applies to Text and Data mining.

We also had some great events throughout the year that you can catch up on, including our annual Open Science Conference, the second annual Open Science & Scholarship Awards, and our first ever Citizen Science Community Event.

We also had a great time working with the UCL Digital Accessibility team throughout the year. They have been a great support in improving the accessibility of everything we do. We were able to highlight their work in one of our Newsletters and we also interviewed Ben Watson, Head of Digital Accessibility, who was a great sport and is an overall inspirational guy!

Upcoming in 2025

by Ray Hennessy on Unsplash https://unsplash.com/photos/gdTxVSAE5sk

We have a great year ahead in 2025. The publication of our next Operational plan, designed to push the team to bigger and better things for the office, is imminent. We are hoping to get that out in the first quarter of the year; it’s just going through various stages of internal feedback before we can get it out there! We will be continuing to grow our newest social media channels, LinkedIn and BlueSky, and if you don’t subscribe to our newsletter, now’s your chance!

We will also be challenging ourselves to bigger and better things when it comes to our conference. You all know that we like to change it up and this year we are reaching out to friends and colleagues to change our conference into our first festival!

We will also be continuing the brilliant series with Ilan Kelman on the Risks and Opportunities of Open Science. We have already shared the first two parts but keep an eye out for the last three parts coming soon!

Launching today: Open Science Case Studies

By Kirsty, on 29 April 2024

Announcement from Paul Ayris, Pro-Vice Provost, UCL Library, Culture, Collections and Open Science

How can Open Science/Open Research support career progression and development? How does the adoption of Open Science/Open Research approaches benefit individuals in the course of their career?

The UCL Open Science Office, in conjunction with colleagues across UCL, has produced a series of Case Studies showing how UCL academics can use Open Science/Open Research approaches in their plans for career development, in applications for promotion and in appraisal documents.

In this way, Open Science/Open Research practice can become part of the Research Culture that UCL is developing.

The series of Case Studies covers each of the 8 pillars of Open Science/Open Research. They can be found on a new webpage: Open Science Case Studies 4 UCL.

It is only fair that academics should be rewarded for developing their skills and adopting best practice in research and in its equitable dissemination. The Case Studies show how this can be done, and each Case Study identifies a Key Message which UCL academics can use to shape their activities.

Examples of good practice are:

  • Publishing outputs as Open Access outputs
  • Sharing research data which is used as the building block of academic books and papers
  • Creating open source software which is then available for others to re-use and develop
  • Adopting practices allied to Reproducibility and Research Integrity
  • The responsible use of Bibliometrics
  • Public Engagement: Citizen Science and Co-Production as mechanisms to deliver results

Contact the UCL Open Science Office for further information at openscience@ucl.ac.uk.

Getting a Handle on Third-Party Datasets: Researcher Needs and Challenges

By Rafael, on 16 February 2024

Guest post by Michelle Harricharan, Senior Research Data Steward, in celebration of International Love Data Week 2024.

ARC Data Stewards have completed the first phase of work on the third-party datasets project, aiming to help researchers better access and manage data provided to UCL by external organisations.

The problem:

Modern research often requires access to large volumes of data generated outside of universities. These datasets, provided to UCL by third parties, are typically generated during routine service delivery or other activities and are used in research to identify patterns and make predictions. UCL research and teaching increasingly rely on access to these datasets, which range from NHS data to large-scale commercial datasets such as those provided by ‘X’ (formerly known as Twitter), to achieve their objectives.

Currently, there is no centrally supported process for research groups seeking to access third-party datasets. Researchers sometimes use departmental procedures to acquire personal or university-wide licenses for third-party datasets. They then transfer, store, document, extract, and undertake actions to minimize information risk before using the data for various analyses. The process to obtain third-party data involves significant overhead, including contracts, compliance (IG), and finance. Delays in acquiring access to data can be a significant barrier to research. Some UCL research teams also provide additional support services such as sharing, managing access to, licensing, and redistributing specialist third-party datasets for other research teams. These teams increasingly take on governance and training responsibilities for these specialist datasets. Concurrently, the e-resources team in the library negotiates access to third-party datasets for UCL staff and students following established library procedures.

It has long been recognized that UCL’s processes for acquiring and managing third-party data are uncoordinated and inefficient, leading to inadvertent duplication, unnecessary expense, and underutilisation of datasets that could support transformative research across multiple projects or research groups. This was recognised in the “Data First, 2019 UCL Research Data Strategy”.

What we did:

Last year, the ARC Data Stewards team reached out to UCL professional services staff and researchers to understand the processes and challenges they faced regarding accessing and using third-party research datasets. We hoped that insights from these conversations could be used to develop more streamlined support and services for researchers and make it easier for them to find and use data already provided to UCL by third parties (where this is within licensing conditions).

During this phase of work, we spoke with 14 members of staff:

  • 7 research teams that manage third-party datasets
  • 7 members of professional services that support or may support the process, including contracts, data protection, legal, Information Services Division (databases), information security, research ethics and integrity, and the library.

What we’ve learned:

An important aspect of this work involved capturing the existing processes researchers use when accessing, managing, storing, sharing, and deleting third-party research data at UCL. This enabled us to understand the range of processes involved in handling this type of data and identify the various stakeholders involved—or who potentially need to be involved. In practice, we found that researchers follow similar processes to access and manage third-party research data, depending on the security of the dataset. However, as there is no central, agreed procedure to support the management of third-party datasets in the organization, different parts of the process may be implemented differently by different teams using the methods and resources available to them. We turned the challenges researchers identified in accessing and managing this type of data into requirements for a suite of services to support the delivery and management of third-party datasets at UCL.

Next steps:

We have been working on addressing some of the common challenges researchers identified. Researchers noted that getting contracts agreed and signed off takes too long, so we reached out to the RIS Contract Services Team, who are actively working to build additional capacity into the service as part of a wider transformation programme.

Also, information about accessing and managing third-party datasets is fragmented, and researchers often don’t know where to go for help, particularly for governance and technical advice. To counter this, we are bringing relevant professional services together to agree on a process for supporting access to third-party datasets.

Finally, respondents noted that there is too much duplication of data. The costs for data are high, and it’s not easy to know what’s already available internally to reuse. In response, we are building a searchable catalogue of third-party datasets already licensed to UCL researchers and available for others to request access to reuse.

Our progress will be reported to the Research Data Working Group, which acts as a central point of contact and a forum for discussion on aspects of research data support at UCL. The group advocates for continual improvement of research data governance.

If you would like to know more about any of these strands of work, please do not hesitate to reach out (email: researchdata-support@ucl.ac.uk). We are keen to work with researchers and other professional services to solve these shared challenges and accelerate research and collaboration using third-party datasets.

Get involved!

The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on X, formerly Twitter, and join our mailing list to be part of the conversation!

FAIR Data in Practice

By Rafael, on 15 February 2024

Guest post by Victor Olago, Senior Research Data Steward and Shipra Suman, Research Data Steward, in celebration of International Love Data Week 2024.

Image depicting the FAIR guiding principles for data resources: Findable, Accessible, Interoperable, and Reusable. Created by Sangya Pundir.

Credit: Sangya Pundir, CC BY-SA 4.0 via Wikimedia Commons

The problem:

We all know sharing is caring, and so data needs to be shared to explore its full potential and usefulness. This makes it possible for researchers to answer questions that were not the primary research objective of the initial study. The shared data also allows other researchers to replicate the findings underpinning the manuscript, which is important in knowledge sharing. It also allows other researchers to integrate these datasets with other existing datasets, either already collected or which will be collected in the future.

There are several factors that can hamper research data sharing. These might include a lack of technical skill, inadequate funding, an absence of data sharing agreements, or ethical barriers. As Data Stewards we support appropriate ways of collecting, standardizing, using, sharing, and archiving research data. We are also responsible for advocating best practices and policies on data. One of such best practices and policies includes the promotion and the implementation of the FAIR data principles.

FAIR is an acronym for Findable, Accessible, Interoperable and Reusable [1]. FAIR is about making data discoverable to other researchers, but it does not translate exactly to Open Data. Some data can only be shared with others once security considerations have been addressed. For researchers to use the data, a concept-note or protocol must be in place to help gatekeepers of that data understand what each data request is meant for, how the data will be processed and the expected outcomes of the study or sub-study. Findability and Accessibility are ensured through metadata and by enforcing the use of persistent identifiers for a given dataset. Interoperability relates to applying standards and encodings such as ICD-10 and ICD-O-3 [2]. Lastly, Reusability means making it possible for the data to be used by other researchers.

What we are doing:

We are currently supporting a data reuse project at the Medical Research Council Clinical Trials Unit (MRC CTU). This project enables the secondary analysis of clinical trial data. We use pseudonymisation techniques and prepare metadata that goes along with each data set.

Pseudonymisation processes personal data in such a way that the data cannot be attributed to specific data subjects without the use of additional information [3]. This reduces the risk of reidentification of personal data. When data is pseudonymised, direct identifiers are dropped while potentially identifiable information is coded. Data may also be aggregated; for example, age is transformed into age groups. There are instances where data is sampled from the original distribution, so that only the sample data is shared. Pseudonymised data is still personal data, which must be protected in line with GDPR [4].
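The steps described above (dropping direct identifiers, coding potentially identifiable information and aggregating age into groups) can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions and not the procedure used at MRC CTU: the field names, the ten-year age bands, the keyed-hash coding and the example record are all hypothetical and entirely synthetic.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored securely and
# separately from the data, since it is the "additional information"
# needed to link codes back to individuals.
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Hypothetical names of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address"}


def code_identifier(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash so the same input maps to the same code."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def age_to_group(age, width=10):
    """Aggregate an exact age into a band, e.g. 67 becomes '60-69'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def pseudonymise(record):
    """Drop direct identifiers, code the participant ID and band the age."""
    shared = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # direct identifiers are dropped entirely
        if field == "participant_id":
            shared[field] = code_identifier(value)
        elif field == "age":
            shared["age_group"] = age_to_group(value)
        else:
            shared[field] = value
    return shared


# Synthetic example record, invented for illustration.
record = {
    "participant_id": "P-0042",
    "name": "Example Person",
    "address": "1 Example Street",
    "age": 67,
    "treatment_arm": "standard dose",
}
shared = pseudonymise(record)
print(shared)
```

Note that the output is still personal data: anyone holding the key could regenerate the codes and link them back to participants, which is why the key must be kept apart from the shared dataset.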

The metadata makes it possible for other researchers to locate and request access to reuse clinical trials data at MRC CTU. With the extensive documentation that is attached, reanalysis and/or integration with other datasets become possible once access is approved. Pseudonymisation and metadata preparation help in promoting FAIR data.

We have so far prepared one data-pack, for the RT01 study, ‘A randomized controlled trial of high dose versus standard dose conformal radiotherapy for localized prostate cancer’. It is currently in the review phase and almost ready to share with requestors. Over the next few years, we hope to repeat and standardise the process for past, current and future studies of Cancer, HIV, and other trials.

References:

  1. 8 Pillars of Open Science.
  2. NHS Digital. National Clinical Coding Standards ICD-10 5th Edition (2022).
  3. Anonymisation and Pseudonymisation.
  4. Complete guide to GDPR compliance.
