Open@UCL Blog

Love Data Week – UCL’s Research Data Storage Service (RDSS) now open to external collaborators!

By Kirsty, on 12 February 2021

Guest post by James Wilson, Head of Research Data Services


Over the last year we’ve been making a number of improvements to the Research Data Storage Service (RDSS) to help researchers store and access their data in a way that better corresponds to how they work.

The RDSS is a managed storage service that helps researchers comply with funders’ criteria for good data management. It provides a storage space for research projects so that anyone involved in that project has a secure area in which to store and share files with their collaborators. Projects in the RDSS do not need to be formal, externally funded projects – they can be for personal research, or small unfunded collaborations between colleagues – but the service is well adapted for large projects with compute and multi-terabyte storage requirements.

That said, the service has had some limitations in the past which we have been addressing. The foremost amongst these was that you needed to be a member of UCL in order to use it. Increasingly, however, research is undertaken with collaborators around the world or in partnership with industry. Covid-19 has only accelerated this trend. We have recently added external collaborator functionality, enabling PIs to add external project members via a simple email invitation from within the interface.

We have also integrated the RDSS with UCL’s Research Data Repository, a platform that enables data and other non-traditional research outputs to be published, cited, and preserved over the long term. Researchers with a project registered in the RDSS can now move files, including very large files, across to the repository, along with contextual information.

As the volume of data in the RDSS grows, we extend our capacity to match. We added 600 terabytes of capacity during 2020, and will be adding a further petabyte of storage this coming term. The first terabyte of storage for any project is provided free of charge, with larger projects charged at £50 per TB per year. This gets you two copies of your data on disk in two different physical data halls at UCL’s Slough data centre. A third back-up copy is saved to tape, and there is a 30-day retention period to help protect against accidental deletion.
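
For anyone budgeting a project, the pricing above can be turned into a rough estimate. The sketch below is illustrative only and is not an official UCL calculator: it assumes the £50 rate applies only to storage beyond the first free terabyte and that part-terabytes are charged pro rata, neither of which is confirmed here. The function name and constants are hypothetical.

```python
# Rough annual cost estimate for an RDSS project allocation.
# ASSUMPTIONS (not confirmed by the service): the charge applies only to
# storage above the first free terabyte, and fractions of a TB are pro rata.

FREE_TB = 1.0                  # first terabyte is free of charge
RATE_GBP_PER_TB_YEAR = 50.0    # quoted rate for larger projects


def estimate_annual_cost(allocated_tb: float) -> float:
    """Return the estimated yearly charge in GBP for a given allocation."""
    if allocated_tb < 0:
        raise ValueError("allocated_tb must not be negative")
    chargeable_tb = max(0.0, allocated_tb - FREE_TB)
    return chargeable_tb * RATE_GBP_PER_TB_YEAR


if __name__ == "__main__":
    for size_tb in (0.5, 1, 5, 20):
        cost = estimate_annual_cost(size_tb)
        print(f"{size_tb} TB -> GBP {cost:.2f} per year")
```

Under these assumptions a 5 TB project would have 4 chargeable terabytes. Check the actual charging rules with the service before relying on any figure.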

Further information about the RDSS can be found at https://www.ucl.ac.uk/isd/services/research-it-services

Love Data Week – Sharing data? Your questions answered

By Kirsty, on 10 February 2021

Guest post by James Houghton, Research Data Support Officer


Dealing with research data, and the associated legal and administrative issues, can be confusing. This article responds to some of the most frequent questions and points of confusion people have regarding research data management.

Do I always have to share data?

Not always. In general, though, data sharing is required unless you have a very good reason not to, and UCL expects research to be shared as widely as possible. Data sharing may be inappropriate in the following situations:

  • The project contains personal data which could compromise the privacy of individuals. In this case the Data Protection Act (2018) applies and the data cannot be shared.
  • There is a possibility that the research could be commercialised. In this case, data should not be shared before obtaining necessary patent protections.
  • There are other ethical concerns that justify withholding the data. For example, data on an endangered species might be used by poachers, so it would be reasonable not to share this data.

If you are ever unsure about releasing data, speak to someone before you proceed. The Library RDM team and the Data Protection Team can advise on this.

Does UCL have a data sharing policy?

Yes, and it specifies the expectations placed on all UCL staff and students on making data available.

Be aware that in addition to the UCL policy, funding agencies will have their own requirements. You need to be compliant with all policies that might apply!

So, I need to share my data. Does UCL have a platform for data sharing?

Yes, we do! UCL has its own data repository service, the UCL Research Data Repository.

I don’t have any data.

The term “data” is used as a shorthand to cover all research outputs, so even if you think you don’t have data, you probably generate something during the course of your research that should be preserved and potentially shared. Even if your field uses a different term you are probably still bound by the data sharing policy.

Here is a wide-ranging list of what could be considered “research data”:

  • Research notebooks, detailing progress of research and experiments
  • Responses to surveys and questionnaires
  • Software, code, algorithms, and models
  • Measurements from laboratory or field equipment
  • Images (such as photographs, films, scans of documents)
  • Methods, protocols, and experimental procedures
  • Databases of collected information
  • A corpus of writings
  • Audio and video recordings
  • Interview transcripts
  • Physical samples and objects

If you have an output not included in this list, it could still be classed as research data!

What on earth is metadata?

Metadata is simply data that describes other data. Here are a few examples:

  • A description of the inclusion criteria for enrolling participants in a study
  • The set of questions used in interviews
  • Any file naming conventions used to keep track of data
  • The parameters used by any equipment used to make measurements
  • The dates and times images were taken
  • Details of quality assurance steps to explain why some data points were deemed to be erroneous and unsuitable for analysis
  • Administrative information such as dates of interviews, experiments or visits to a location

This is not an exhaustive list by any means! Metadata can vary considerably between projects and research fields.

In the same way data might underpin the results of a project, metadata could be said to underpin the methods of a project. If you need to address the issue of metadata, think about what another researcher would need to know to replicate the data as closely as possible.
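
To make this concrete, here is a minimal sketch of how some of the metadata listed above might be recorded in a small JSON file kept alongside a dataset. It is only an illustration: the field names, the `DatasetMetadata` class, and the example values are all hypothetical and do not follow any UCL template or formal metadata standard.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; the field names are illustrative only."""
    title: str
    file_naming_convention: str
    inclusion_criteria: str
    instrument_parameters: dict = field(default_factory=dict)
    collection_dates: list = field(default_factory=list)
    quality_assurance_notes: str = ""


def to_sidecar_json(meta: DatasetMetadata) -> str:
    """Serialise the record to JSON text that can be saved next to the data."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


# Invented example values, purely to show the shape of a record.
example = DatasetMetadata(
    title="Example interview study",
    file_naming_convention="<participantID>_<YYYYMMDD>_<session>.wav",
    inclusion_criteria="Adults aged 18 or over who gave written consent",
    instrument_parameters={"sample_rate_hz": 44100, "channels": 1},
    collection_dates=["2021-01-12", "2021-01-19"],
    quality_assurance_notes="Two recordings excluded because of audio dropout",
)

if __name__ == "__main__":
    print(to_sidecar_json(example))
```

A plain-text README covering the same points would serve just as well. What matters is that another researcher can find out how the data were produced.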

What resources can I access at UCL to store data safely?

All UCL IT managed storage services have automated backups in place to protect data and are recommended over using your own personal devices or individual cloud storage accounts. There are a few different options depending on your needs:

  • The personal N: drive or S: drives are fine for day-to-day storage of PDFs, office documents and non-sensitive materials.
  • The Research Data Storage Service supports high-speed file transfer for large quantities of data and is extremely useful for anyone who wants to work with the high-performance computing clusters.
  • The Data Safe Haven is specifically designed to store personal data covered by the Data Protection Act 2018. This secure service helps you meet legal obligations on data security when relevant.
  • Services such as SharePoint and OneDrive can be useful for collaboration with colleagues and allow for functionality such as simultaneous editing of documents.

Need more information?

We have extended guidance on research data management available on our website and the library research data management team can be contacted to discuss specific issues at: lib-researchsupport@ucl.ac.uk

Love Data Week – Research Data Management at UCL: 2020 in review

By Kirsty, on 8 February 2021

To celebrate Love Data Week, the Research Data Management team have prepared a review of 2020, looking back over the past 12 months and reflecting on progress made in a number of areas.

Follow the link below to read the report and find out more about the Research Data Management and Sharing Plan review service, our new online training courses on writing data management plans and on open science and scholarship, and improved guidance about making research data FAIR (findable, accessible, interoperable and reusable) within the wider open science and scholarship context. You can also find out about the newly revised research data policy, which includes updated advice for UCL staff and research students on managing their research outputs.

Finally, you can find out about the number, amount and types of research outputs published using the UCL Research Data Repository, as well as the number and variety of views and downloads.

Download and explore the report on the UCL Research Data Repository.

 

Open Access Week: Open Data – the future treasures of the past

By Kirsty, on 21 October 2020

Here at UCL you are very often told 

Of the benefits associated with publishing via green or gold, 

But what of Research Data and saving them for later? 

What is this new thing you have stumbled across?  

Preserving research outputs and protecting them against loss. 

 

It’s Open Data of course!  

They’re freely available online  

To download and unwind  

With good quality metadata assigned.    

Open data are nestled within Open Science and Scholarship 

without barriers rooted in design.  

 

Open data with DOIs so FAIR,  

The trophies of your commitment to access and to share 

So enhance the potential for reuse

and reduce wasted efforts  

of those who seek to uncover new knowledge, new inference,

Let us discover.   

 

So what is the issue? Where is the harm? 

What is the problem? Why the alarm? 

 

Too expensive they said 

Too much time they said 

Little reward for my efforts and for any of the dread.  

How are we to ascend the ladder? 

For making our data open, we’re not getting any gladder.

  

Research Integrity, Transparency and Reproducibility 

these are your prized rewards. 

Repurpose and explore,   

To maximise the return, just what is in store?  

So take down the university wall  

Bring in the citizens,  

Address the balance so all stand equally as tall. 

 

Now, with help and support and cups of on-screen tea at the ready,  

The Research Data Management team are on-hand to keep you steady! 

We cheer those pursuing a place for research data  

in need of archiving and preserving and just plain keeping safe. 

We can talk for hours about the research data lifecycle and all that it can entail, 

open data – we shall prevail! 

Here’s to the UCL Research Data policy 

and to all those wishing to make their funders smile,  

With the UCL Research Data Repository, we can help you do it in style! 

 

So to all my fellow disruptive thinkers,  

now is the time for us all to give open data a trial  

and catch up to those who have been open a while. 

Embrace open practices and make fast  

open data – the future treasures of the past.   

By Dr Christiana McMahon | Research Data Support Officer 

FORCE11 – report from Data in the Scholarly Communications Life Cycle workshop

By Patrycja, on 20 August 2018

This summer saw the second FORCE11 Scholarly Communication Institute (FSCI) held at the University of California, San Diego, which I had an amazing opportunity to attend. Participants in the FORCE11 summer camp selected three courses from an extensive course list; morning classes ran through the whole week, while afternoon ones took place over two days.

Geisel Library – Main Library of UCSD

In the mornings I attended the Data in the Scholarly Communications Life Cycle workshops. The class was expertly and entertainingly* run by Natasha Simmons, Program Leader, Skills Policy and Resources at the Australian National Data Service (ANDS). The course was structured around the 23 (research data) Things, a self-directed learning programme developed by ANDS that is suitable for everyone, regardless of their skills and prior knowledge. The programme is full of resources and fascinating data; have a look yourself here.

We started with an introduction to research data (of course!) and discussed data in the scholarly communications lifecycle – this offers a framework for understanding research processes, and a good (interactive) example is available here. We also talked about data sharing models, and challenges around data sharing.

For Tuesday’s session Natasha invited Stephanie Simms from the California Digital Library, who presented an introduction to data management plans and the DMPonline tool. We also heard some open data stories: one from the Australia Telescope National Facility, which makes available images of the sky collected at the facility, and one from the PetaJakarta Data Sharing Project, which gathers data from social media (in this case Twitter) to collect information about flooding in Jakarta.

On Wednesday Reid Otsuji from UC San Diego talked about the Open Science Framework and The Carpentries, a way of acquiring new coding and data skills for researchers and librarians. We also talked about making research data FAIR (Findable, Accessible, Interoperable, Reusable), and used a FAIR data assessment tool to look at some openly available data. This provoked a discussion on how easy it is to make research data FAIR (not that easy!) and how institutions could provide the infrastructure and support that are required.

Data in the Scholarly Communications Life Cycle – Open Data Debate

 

The following morning we had guest speakers from UC Berkeley, Rachael Samberg and Maria Gould, who presented on licensing research data. This was an extremely interesting talk, covering copyright and licensing of data both generated and used by researchers. Later in class we discussed issues around personal and sensitive data. On Friday Gustavo Durand introduced Dataverse, an open source platform developed at Harvard that allows researchers to publish, cite and archive their research data. At the end of the workshop Natasha introduced persistent identifiers and their use in data citation, and we explored different citation styles.

Hands-on exercises throughout the course allowed me to experience working with research data, and to see issues around data management from a researcher’s perspective. Guest speakers provided me with an opportunity to gain expert insight into many aspects of research data management, and the course structure allowed for numerous discussions and debates. This in turn made me reflect on how nuanced managing research data can be, and not only when it comes to copyright and licensing.

*I learnt a lot about Australian wildlife too!