Open@UCL Blog

Archive for the 'Guest post' Category

Getting a Handle on Third-Party Datasets: Researcher Needs and Challenges

By Rafael, on 16 February 2024

Guest post by Michelle Harricharan, Senior Research Data Steward, in celebration of International Love Data Week 2024.

ARC Data Stewards have completed the first phase of work on the third-party datasets project, aiming to help researchers better access and manage data provided to UCL by external organisations.

The problem:

Modern research often requires access to large volumes of data generated outside of universities. These datasets, provided to UCL by third parties, are typically generated during routine service delivery or other activities and are used in research to identify patterns and make predictions. UCL research and teaching increasingly rely on access to these datasets, which range from NHS data to large-scale commercial datasets such as those provided by ‘X’ (formerly known as Twitter).

Currently, there is no centrally supported process for research groups seeking to access third-party datasets. Researchers sometimes use departmental procedures to acquire personal or university-wide licenses for third-party datasets. They then transfer, store, document, extract, and undertake actions to minimize information risk before using the data for various analyses. The process to obtain third-party data involves significant overhead, including contracts, compliance (IG), and finance. Delays in acquiring access to data can be a significant barrier to research. Some UCL research teams also provide additional support services such as sharing, managing access to, licensing, and redistributing specialist third-party datasets for other research teams. These teams increasingly take on governance and training responsibilities for these specialist datasets. Concurrently, the e-resources team in the library negotiates access to third-party datasets for UCL staff and students following established library procedures.

It has long been recognized that UCL’s processes for acquiring and managing third-party data are uncoordinated and inefficient, leading to inadvertent duplication, unnecessary expense, and underutilisation of datasets that could support transformative research across multiple projects or research groups. This was recognised in the “Data First, 2019 UCL Research Data Strategy”.

What we did:

Last year, the ARC Data Stewards team reached out to UCL professional services staff and researchers to understand the processes and challenges they faced regarding accessing and using third-party research datasets. We hoped that insights from these conversations could be used to develop more streamlined support and services for researchers and make it easier for them to find and use data already provided to UCL by third parties (where this is within licensing conditions).

During this phase of work, we spoke with 14 members of staff:

  • 7 research teams that manage third-party datasets
  • 7 members of professional services that support or may support the process, including contracts, data protection, legal, Information Services Division (databases), information security, research ethics and integrity, and the library.

What we’ve learned:

An important aspect of this work involved capturing the existing processes researchers use when accessing, managing, storing, sharing, and deleting third-party research data at UCL. This enabled us to understand the range of processes involved in handling this type of data and identify the various stakeholders involved—or who potentially need to be involved. In practice, we found that researchers follow similar processes to access and manage third-party research data, depending on the security of the dataset. However, as there is no central, agreed procedure to support the management of third-party datasets in the organization, different parts of the process may be implemented differently by different teams using the methods and resources available to them. We turned the challenges researchers identified in accessing and managing this type of data into requirements for a suite of services to support the delivery and management of third-party datasets at UCL.

Next steps:

We have been working on addressing some of the common challenges researchers identified. Researchers noted that getting contracts agreed and signed off takes too long, so we reached out to the RIS Contract Services Team, who are actively working to build additional capacity into the service as part of a wider transformation programme.

Also, information about accessing and managing third-party datasets is fragmented, and researchers often don’t know where to go for help, particularly for governance and technical advice. To counter this, we are bringing relevant professional services together to agree on a process for supporting access to third-party datasets.

Finally, respondents noted that there is too much duplication of data. The costs for data are high, and it’s not easy to know what’s already available internally to reuse. In response, we are building a searchable catalogue of third-party datasets already licensed to UCL researchers and available for others to request access to reuse.

Our progress will be reported to the Research Data Working Group, which acts as a central point of contact and a forum for discussion on aspects of research data support at UCL. The group advocates for continual improvement of research data governance.

If you would like to know more about any of these strands of work, please do not hesitate to reach out (email: researchdata-support@ucl.ac.uk). We are keen to work with researchers and other professional services to solve these shared challenges and accelerate research and collaboration using third-party datasets.

Get involved!

The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on X, formerly Twitter, and join our mailing list to be part of the conversation!

FAIR Data in Practice

By Rafael, on 15 February 2024

Guest post by Victor Olago, Senior Research Data Steward and Shipra Suman, Research Data Steward, in celebration of International Love Data Week 2024.

Image depicting the FAIR guiding principles for data resources: Findable, Accessible, Interoperable, and Reusable. Created by SangyaPundir.

Credit: Sangya Pundir, CC BY-SA 4.0 via Wikimedia Commons

The problem:

We all know sharing is caring, and so data needs to be shared to explore its full potential and usefulness. This makes it possible for researchers to answer questions that were not the primary research objective of the initial study. The shared data also allows other researchers to replicate the findings underpinning the manuscript, which is important in knowledge sharing. It also allows other researchers to integrate these datasets with other existing datasets, either already collected or which will be collected in the future.

Several factors can hamper research data sharing. These might include a lack of technical skill, inadequate funding, an absence of data sharing agreements, or ethical barriers. As Data Stewards we support appropriate ways of collecting, standardizing, using, sharing, and archiving research data. We are also responsible for advocating best practices and policies on data. One such best practice is the promotion and implementation of the FAIR data principles.

FAIR is an acronym for Findable, Accessible, Interoperable and Reusable [1]. FAIR is about making data discoverable to other researchers, but it does not translate exactly to Open Data. Some data can only be shared with others once security considerations have been addressed. For researchers to use the data, a concept-note or protocol must be in place to help gatekeepers of that data understand what each data request is meant for, how the data will be processed, and the expected outcomes of the study or sub study. Findability and Accessibility are ensured through metadata and by enforcing the use of persistent identifiers for a given dataset. Interoperability relates to applying standards and encodings such as ICD-10 and ICD-O-3 [2]. Lastly, Reusability means making it possible for the data to be used by other researchers.

What we are doing:

We are currently supporting a data reuse project at the Medical Research Council Clinical Trials Unit (MRC CTU). This project enables the secondary analysis of clinical trial data. We use pseudonymisation techniques and prepare metadata that goes along with each data set.

Pseudonymisation processes personal data in such a way that the data cannot be attributed to specific data subjects without the use of additional information [3]. This reduces the risk of reidentification. When data is pseudonymised, direct identifiers are dropped while potentially identifiable information is coded. Data may also be aggregated; for example, age is transformed to age groups. In some instances, data is sampled from the original distribution and only the sample is shared. Pseudonymised data is still personal data and must be protected in accordance with the GDPR [4].
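To make these steps more concrete, here is a minimal sketch in Python of what dropping direct identifiers, coding an identifier, and grouping ages might look like. It is an illustration only and not the process used at MRC CTU: the field names, the use of a keyed hash (HMAC-SHA256) for coding, and the ten-year age bands are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical field names; a real trial dataset will differ.
DIRECT_IDENTIFIERS = {"name", "nhs_number", "address"}

# Assumption: the key is held separately from the shared data, so the
# coded IDs cannot be reversed without this "additional information".
SECRET_KEY = b"replace-with-a-key-held-separately"


def pseudonymise_id(participant_id: str, key: bytes = SECRET_KEY) -> str:
    """Code an identifier as a keyed hash, truncated for readability."""
    digest = hmac.new(key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Aggregate an exact age into a band such as '60-69'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def pseudonymise_record(record: dict) -> dict:
    """Drop direct identifiers, code the participant ID, and band the age."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["participant_id"] = pseudonymise_id(str(record["participant_id"]))
    out["age_band"] = age_band(out.pop("age"))
    return out


example = {
    "participant_id": "P001",
    "name": "Example Person",
    "nhs_number": "000 000 0000",
    "age": 67,
    "arm": "high dose",
}
shared = pseudonymise_record(example)
```

In this sketch the same participant always receives the same coded ID, so records can still be linked within the shared data, while the name and NHS number never leave the original dataset.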

The metadata makes it possible for other researchers to locate and request access to reuse clinical trials data at MRC CTU. With the extensive documentation that is attached, reanalysis and/or integration with other datasets become possible once access is approved. Pseudonymisation and metadata preparation help in promoting FAIR data.

We have so far prepared one data-pack, for the RT01 study, ‘A randomized controlled trial of high dose versus standard dose conformal radiotherapy for localized prostate cancer’. It is currently in the review phase and almost ready to share with requestors. Over the next few years, we hope to repeat and standardise the process for past, current and future studies of cancer, HIV, and other trials.

References:    

  1. 8 Pillars of Open Science.
  2. NHS Digital. National Clinical Coding Standards ICD-10 5th Edition (2022).
  3. Anonymisation and Pseudonymisation.
  4. Complete guide to GDPR compliance.

Finding Data Management Tools for Your Research Discipline

By Rafael, on 14 February 2024

Guest post by Iona Preston, Research Data Support Officer, in celebration of International Love Data Week 2024.

Various gardening tools arranged on a dark wooden background

Photo by Todd Quackenbush on Unsplash.

While there are a lot of general resources to support good research data management practices – for example UCL’s Research Data Management webpages – you might sometimes be looking for something a bit more specific. It’s good practice to store your data in a research data repository that is subject specific, where other people in your research discipline are most likely to search for data. However, you might not know where to begin your search. You could be looking for discipline-specific metadata standards, so your data is more easily reusable by academic colleagues in your subject area. This is where subject-specific research data management resources become valuable. Here are some resources for specific subject areas and disciplines that you might find useful: 

  • The Research Data Management Toolkit for Life Sciences
    This resource guides you through the entire process of managing research data, explaining which tools to use at each stage of the research data lifecycle. It includes sections on specific life science research areas, from plant sciences to rare disease data. These sections also cover research community-specific repositories and examples of metadata standards. 
  • Visual arts data skills for researchers: Toolkits
    This consists of two different tutorials covering an introduction to research data management in the visual arts and how to create an appropriate data management plan. 
  • Consortium of European Social Science Data Archives
    CESSDA brings together data archives from across Europe in a searchable catalogue. Their website includes various resources for social scientists to learn more about data management and sharing, along with an extensive training section and a Data Management Expert Guide to lead you through the data management process. 
  • Research Data Alliance for Disciplines (various subject areas)
    The Research Data Alliance is an international initiative to promote data sharing. They have a webpage with special interest groups in various academic research areas, including agriculture, biomedical sciences, chemistry, digital humanities, social science, and librarianship, with useful resource lists for each discipline. 
  • RDA Metadata Standards Catalogue (all subject areas)
    This directory helps you find a suitable metadata scheme to describe your data, organized by subject area, featuring specific schemes across a wide range of academic disciplines. 
  • Re3Data (all subject areas)
    When it comes to sharing data, we always recommend you check if there’s a subject specific repository first, as that’s the best place to share. If you don’t know where to start finding one, this is a great place to look with a convenient browse feature to explore available options within your discipline.

These are only some of the discipline-specific tools that are available. You can find more for your discipline on the Research Data Management webpages. If you need any help and advice on finding data management resources, please get in touch with the Research Data Management team at lib-researchsupport@ucl.ac.uk.

Join us for International Love Data Week!

By Rafael, on 7 February 2024

Guest post by Iona Preston, Research Data Support Officer.

Next week (February 12-16), we’re excited to be celebrating International Love Data Week. We’ll be looking at how data is shared and reused within our UCL and academic community, highlighting the support available across UCL for these initiatives. This year’s theme, “My Kind of Data,” focuses on data equity, inclusion, and disciplinary communities. We’ll be blogging and posting on X throughout the week, so please join us to learn more.

International Love Data Week 2024 poster

Here’s a sneak preview of what’s coming up:

  • Did you know the Research Data Management team can review your data management plan and support you in publishing your data in our Research Data Repository? Find out more about our last year in review with Christiana McMahon, Research Data Support Officer.
  • Have you met any members of our Data Stewards team? James Wilson, Head of Research Data Services, will be explaining how you can collaborate with them to streamline the process of managing and preserving your data, thereby supporting reproducibility and transparency in your research.
  • Are you seeking tools to support best practices in data management for your specific discipline? We have some suggestions from Iona Preston, Research Data Support Officer.
  • You may have heard of FAIR data – but what does that mean in practice? Join Research Data Steward Shipra Suman and Senior Research Data Steward Victor Olago as they discuss projects where they’ve supported making data FAIR.
  • And, finally, to round off the week, Senior Research Data Steward Michelle Harricharan will talk about a project the Data Stewards are carrying out to better support UCL researchers in accessing and managing external datasets.

We look forward to engaging with you throughout the week and hope you enjoy learning more about research data at UCL.

Altmetrics at UCL: one year on!

By Harry, on 29 August 2023

Guest post by Andrew Gray, Bibliometrics Support Officer

Altmetrics, or “alternative metrics”, measure the impact of research beyond the scholarly literature. They cover a wide range of sources, including social media discussions (e.g. Twitter or Facebook), mainstream news reporting, and grey literature such as policy documents. Understanding how research is being reported and discussed in these sources can give us a broader picture of the impact and reach of papers than we get from looking at traditional scholarly citations alone.

UCL has a subscription to Altmetric, the primary commercial database for this information. It covers a broad range of materials. We also subscribe to a second source, Overton, which focuses purely on policy documents and can be a helpful complement.

There are several ways in which looking at altmetrics can give us information that wouldn’t otherwise be available. For example, we can see how different audiences outwith academia are responding to research, and we can look at what they’re saying to get an idea of the kind of response.

Some of the altmetric indicators (particularly Mendeley bookmarks) seem to have a close correlation with subsequent citations and can give us an early view of what citation figures may be like six months to a year in future.

Lastly, tracing policy citations through Altmetric or Overton can effectively demonstrate the wider research impact, for example, for use in a funding report or application.

Looking at activity

So what data can we see? Altmetric provides an aggregated “score” for each paper, indicating an overall activity level. While this isn’t a very exact measure, it lets us identify papers with high and low activity levels.

Looking over the past few years at UCL, the most obvious thing is that discussion of research is dominated by COVID-19. It accounts for thirteen out of the fifteen most heavily discussed UCL papers overall – by comparison, were we to look at pure citation counts, COVID papers account for none of UCL’s top fifteen overall, and only perhaps four out of the top fifteen from the past few years. This very striking difference highlights how altmetrics and citations can show different things.

The colour swatches on each show how the activity is broken down. For example, in this paper, we can see that most of the activity is from X/Twitter (light blue), with smaller contributions from Facebook (dark blue), news media (red) and blogs (yellow). Clicking through will let us drill down to see all the activity details.

Diving into data – day by day

One thing that surprised us with Altmetric is the sheer volume of data that they make available. Reports of 100,000+ papers can be downloaded, including DOIs and PubMed IDs, making it easy to link data to other sources such as RPS and InCites. This lets us do some analyses that wouldn’t be possible in other sources, and that tell us something unexpected.

For example, it gives us the exact date papers were published. Looking at around 50,000 UCL papers published in 2020-22, we find that the response differs depending on the day of the week – papers on Wednesday and Thursday are above-average, and papers on Tuesdays are below average.

In part, this is because some of the most prestigious publications have fixed publication days: most Nature papers are released on Wednesdays, for example. These journals have a large share of high-impact papers and an excellent publicising system.

The weekends are interesting. Not many papers come out on the weekends, but the ones that do have a noticeable citation/bookmarking penalty compared to weekday ones, suggesting they are less impactful on average. They also make much less of a stir in the news media: a weekend paper is less than half as likely to get news coverage as a weekday one.

But social media has a sharp difference – Sunday papers get significantly more Twitter activity than Saturday ones. An intriguing mystery!
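For anyone curious how a day-of-week comparison like this could be reproduced, here is a rough sketch in Python. It is not the analysis we ran: it assumes each paper has been reduced to a hypothetical pair of an ISO publication date and a single activity score, and the three example rows are invented purely to show the shape of the input.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_score_by_weekday(papers):
    """Average score per weekday for (iso_date, score) pairs."""
    scores = defaultdict(list)
    for published, score in papers:
        day = WEEKDAYS[date.fromisoformat(published).weekday()]
        scores[day].append(score)
    return {day: mean(values) for day, values in scores.items()}


def relative_to_overall(papers):
    """Weekday averages as a ratio of the overall average (1.0 = typical)."""
    overall = mean(score for _, score in papers)
    by_day = mean_score_by_weekday(papers)
    return {day: average / overall for day, average in by_day.items()}


# Invented example rows, not real UCL data.
papers = [
    ("2021-03-03", 30.0),  # a Wednesday
    ("2021-03-02", 10.0),  # a Tuesday
    ("2021-03-03", 50.0),  # a Wednesday
]
ratios = relative_to_overall(papers)
```

A ratio above 1.0 means papers released on that day attract more activity than average, and a ratio below 1.0 means less. With heavily skewed data such as attention scores, a median might be a more robust choice than the mean used here.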

Using Altmetric at UCL

Altmetric and Overton are both available to any user at UCL. You simply need to log in to Altmetric using a UCL email, which will set up your user account. For Overton, you can browse the data without an individual account or set up an account to save searches and other functionalities.

We have integrated Altmetric with RPS, the central UCL publications database. Every two to four weeks, every paper in RPS since 2013 is exported, tagged with the UCL author(s) and associated departments, and uploaded into Altmetric.

This means that we can use the Altmetric dashboard to dig down into UCL outputs in some detail – we can ask it questions like “news stories in the last month referring to a piece of research published by someone in Chemistry”. It is also possible to save and circulate reports from the dashboard – this report shows the top 20 papers from Chemistry in 2023 by Altmetric Activity.

Similar functionality is not yet available for Overton, but if you would like to search for papers from a specific department, we would recommend generating a list of DOIs from InCites (or even from Altmetric itself!) and importing those as an advanced search.
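When preparing a list of DOIs from an export, it can help to tidy them first, since the same DOI often appears in different forms (with or without a resolver URL, in mixed case). The sketch below is a minimal Python illustration and makes some assumptions: the input is simply a list of strings taken from a hypothetical export column, and anything that does not start with "10." after cleaning is dropped.

```python
import re

# Matches common prefixes: https://doi.org/, http://dx.doi.org/, or "doi:".
_PREFIX = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalise_doi(raw: str) -> str:
    """Strip whitespace and resolver prefixes; DOIs are case-insensitive."""
    return _PREFIX.sub("", raw.strip()).lower()


def unique_dois(raw_dois):
    """Return cleaned DOIs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for raw in raw_dois:
        doi = normalise_doi(raw)
        if doi.startswith("10.") and doi not in seen:
            seen.add(doi)
            result.append(doi)
    return result


# Invented example values for illustration.
cleaned = unique_dois([
    "https://doi.org/10.1000/ABC",
    "doi:10.1000/abc",
    " 10.1000/xyz ",
    "not a doi",
])
```

The resulting list could then be saved one DOI per line, ready to paste or upload wherever a DOI list is accepted.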

We will be running introductory training sessions for both Altmetric and Overton in the coming term – please contact bibliometrics@ucl.ac.uk if you would be interested in attending these or booking a 1:1 meeting to go through the services.

Have you seen our new UCL Citizen Science website pages?

By Harry, on 15 August 2023

Guest post by Sheetal Saujani, Citizen Science Coordinator

We are pleased to launch our new and improved Citizen Science web pages on UCL’s Office for Open Science and Scholarship website. You can now access the updated content and browse what UCL is doing in this fast-growing and exciting area!

Citizen science includes a wide range of activities, and it is gaining increasing recognition among the public and within the area of research. UCL recognises citizen science as a diverse practice, encompassing various forms, depths and aims of collaboration between academic and community researchers and various disciplines.

workshop meeting
Check out our new website pages:

  • Defining Citizen Science: whether you call it participatory research, community action, crowdsourcing, public engagement, or anything else, have a look at our word cloud showing various activities and practices falling under one umbrella. UCL teams are collaborating on different projects and working together under a joint mission to strengthen UCL’s activities. This fosters stronger connections and more collaborative solutions.
  • Citizen Science projects: discover the broad range of innovative projects at UCL (grouped by discipline) showcasing various ways to use a citizen science approach in research. If you have a citizen science project to feature or have any questions, please contact us.
  • History of Citizen Science: explore the exciting history of citizen science, early definitions, and three relevant periods in modern science. Learn about one of the longest-running citizen science projects!
  • Types and levels of Citizen Science: read about the growth of citizen science, which has led to the development of three broad categories: ‘long-running citizen science’, ‘citizen cyberscience’, and ‘community science’. Citizen science practices can be categorised into a continuum using the ‘Doing It Together Science’ escalator model. This model focuses on individual participation levels, allowing individuals to choose the best level for their needs, interests, and free time.
  • UCL Citizen Science Certificate: find out about this high-quality, non-academic certification awarded to individuals who complete a training programme as part of the UCL Citizen Science Academy. The Certificate recognises research abilities through participation in active projects, enabling citizen scientists to influence local decisions.

The Office for Open Science and Scholarship is working to raise awareness of citizen science approaches and activities to build a support service and a community around citizen science.  We are bringing together colleagues who have run or are currently running citizen science projects, to share experiences and encourage others to do the same.

If you are interested in citizen science, we would like to hear from you, so please get in touch by email at openscience@ucl.ac.uk and tell us what you need.

What might a Citizen Science approach in your research project look like?

By Harry, on 27 March 2023

Guest post by Sheetal Saujani, Citizen Science Coordinator

Have you thought about including members of the public in your research?  Would you like to connect and collaborate with the community around you? Alternatively, would you like to work with project leaders to answer real-world questions and gather data?

Broadly defined, citizen science is research undertaken by members of the public, often in collaboration with academic or research institutions or similar. Citizen science is a diverse practice involving various forms and aims of collaboration between academic and community researchers and a broad range of disciplines.

What are the great things about Citizen Science?

Working together as part of a community with professionals, citizen scientists can play an important part in genuine discovery, experiments, data collection and analysis. Through citizen science, any one of us can take part in extraordinary research!

We can improve our community whilst at the same time helping to provide answers to some of the big questions about the world we live in.  Whether we participate in projects that measure air quality, monitor damage from storms, or track where our rubbish is going, we can help solve problems and influence a better future for our society.

The Office for Open Science and Scholarship advocates a broad approach to citizen science, so whether you call it citizen science, participatory research, community action, co-production, public engagement, or anything else, we’re all working together to strengthen UCL activities in this area!

What do Citizen Science projects look like?

Take a look at some of the exciting citizen science projects run by various research groups and departments at UCL. Some of these projects have now been completed.

And below are a few newer ones (this list is not exhaustive):

Also, if you’re interested, there are many platforms and projects happening outside of UCL (below are just a few):

  • Thousands of people across the country take part in the Natural History Museum’s crowdsourced science projects.
  • On the SciStarter website you can join and contribute to science through thousands of amazing research projects and events.
  • With more than one million volunteers, Zooniverse is one of the biggest citizen science platforms in the UK.
  • If you’re interested in Biology, Ecology or Earth Science, check out the citizen science projects run by the National Geographic Society.
  • The InSPIRES Open Platform is an online collaborative and crowdsourced database featuring many citizen-led participatory research and innovation projects.
  • Patientslikeme is an online platform where patients can share and learn from real-time, outcome-based health data and contribute to the scientific conversation surrounding thousands of diseases.
  • The Globe at Night project aims to raise awareness about light pollution and its impacts on communities. You can report your night sky brightness observations daily.

What is UCL doing around Citizen Science?

Our Office is working to raise awareness of citizen science approaches and activities, with the aim of building a support service and a community around citizen science.  The plan is to bring together colleagues who’ve run or are currently running citizen science or participatory research projects to share good practices and experiences with each other and support and encourage others to do the same!

If you are interested in citizen science, we would really like to hear from you, so please get in touch with us via email at openscience@ucl.ac.uk and tell us what you need.

How understanding Copyright helps you open up your research

By Harry, on 14 March 2023

Guest post by Christine Daoutis, Copyright Support Officer

“Can I use this image I found free online?”

“I’m not sure how much of a book or an article it’s OK to copy”.

“This is my article; surely I can post it anywhere I want?”

These questions, and quite a few others, often come up in everyday research practice. They are all related to copyright. Whether you are reusing others’ materials (documents, figures, photos, video, software, data) or creating and sharing your own, understanding copyright ensures not only that you can respect others’ rights and stay within the law, but also that you can open up your research.

But understanding copyright is much more than a legal compliance issue. It is also more than an academic integrity issue. In short, it’s not just about following the rules, but also about understanding your own rights and using the rules flexibly. You can use your copyright knowledge as a tool to open up, rather than restrict, your research. For example, relying on copyright exceptions[1], and knowing how to find, reuse and acknowledge openly licensed materials[2], can give you much more freedom in how you can reuse others’ works. Crucially, knowing your rights as authors also allows you to share your research openly and, through licensing, determine how others may reuse it. Open Science practices – open access to publications, open data, open source software and hardware, co-creation projects – rely on an understanding of copyright.

To help you increase your knowledge and confidence around copyright, you can do any of the following:

Infographic showing key UCL copyright resources. Top three resources: UCL copyright survey, copyright essentials, training sessions. Supporting resources: copyright website, copyright blog, contact the UCL copyright support team.

  1. Complete the 3-minute UCL copyright support survey to rate your confidence and tell us what support you need. If you are not sure what you need to know, the survey gives you some ideas to choose from. Currently open until 31 March 2023.
  2. Complete the 20-minute Copyright Essentials online module. You will learn the basics at your own pace, using quizzes, short videos and academic-based scenarios.
  3. Book a training session delivered by the copyright support team. These can be in person or online, and offer you the chance to ask questions.
  4. Visit the UCL copyright website for guidance on specific copyright topics.
  5. Follow the copyright blog for topical articles and updates.
  6. Contact the UCL copyright support team if you have a specific question, or would like to arrange bespoke training.

 

[1] https://blogs.ucl.ac.uk/copyright/2023/02/24/fair-dealing-week-2023-part-2-three-fair-dealing-exceptions/

[2] https://creativecommons.org/faq/

 

UCL Research Data Repository: Publishing research outputs for staff and PhD students across in 2022

By Harry, on 17 February 2023

Dr Christiana McMahon & Christine Buckley – Research Data Support Officers

At UCL, we have a dedicated Research Data Repository. This can be used by staff and research students to archive and preserve research outputs. This can be anything from your datasets to a poster you presented at a conference.

What have we published?

In total, we published 162 items!

Total number of views in 2022: 172059

Total number of downloads in 2022: 117830

What is a Data Management Plan (DMP)?

By Harry, on 15 February 2023

Dr Christiana McMahon & Christine Buckley – Research Data Support Officers

A Data Management Plan or DMP is an essential part of research data management and is usually completed in the first stage of any research project. It can help you think clearly about what data you will collect and how to store, curate, back up, archive and share this data.

You’ll find that many funders include a DMP as part of their grant applications, and we are more than happy to help review these.

You can check our recently updated webpage to learn how to create your DMP. 

How do I get support?

Just email us a copy of your plan to lib-researchsupport@ucl.ac.uk, or you can create your plan in DMPonline and request feedback.

How many DMPs have we reviewed?

Over the course of 2022, we reviewed a total of 39 plans, most of which supported grant applications submitted by researchers here at UCL.

The most popular months for sharing plans for feedback with the Research Data Management team were… April, June and October!