Open@UCL Blog

Love Data Week – UCL’s Research Data Storage Service (RDSS) now open to external collaborators!

Kirsty, 12 February 2021

Guest post by James Wilson, Head of Research Data Services


Over the last year we’ve been making a number of improvements to the Research Data Storage Service (RDSS) to help researchers store and access their data in a way that better corresponds to how they work.

The RDSS is a managed storage service that helps researchers comply with funders’ criteria for good data management. It provides a storage space for research projects so that anyone involved in that project has a secure area in which to store and share files with their collaborators. Projects in the RDSS do not need to be formal, externally funded projects – they can be for personal research, or small unfunded collaborations between colleagues – but the service is well adapted for large projects with compute and multi-terabyte storage requirements.

That said, the service has had some limitations in the past which we have been addressing. The foremost amongst these was that you needed to be a member of UCL in order to use it. Increasingly, however, research is undertaken with collaborators around the world or in partnership with industry. Covid-19 has only accelerated this trend. We have recently added external collaborator functionality, enabling PIs to add external project members via a simple email invitation from within the interface.

We have also integrated the RDSS with UCL’s Research Data Repository – a platform that enables data and other non-traditional research outputs to be published, cited, and preserved over the long term. Researchers with a project registered in the RDSS can now move files, including very large files, across to the repository, along with contextual information.

As the volume of data in the RDSS grows, we extend our capacity. We added an additional 600 terabytes of capacity during 2020, and will be adding a further petabyte of storage this coming term. The first terabyte of storage for any project is provided free of charge, with larger projects charged at £50 per TB per year. This gets you two copies of your data on disk in two different physical data halls at UCL’s Slough data centre. A third back-up copy is saved to tape, and there is a 30-day retention period to help protect against accidental deletion.
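
As a rough illustration of the pricing described above, the sketch below estimates the annual charge for a project. It rests on two assumptions that this post does not confirm: that only storage beyond the free first terabyte is charged, and that part-used terabytes are rounded up. The function name is made up for the example, so please check with the RDSS team before budgeting.

```python
import math

FREE_TB = 1            # first terabyte is free (stated in the post)
PRICE_PER_TB_GBP = 50  # £50 per TB per year (stated in the post)


def estimate_annual_cost_gbp(storage_tb: float) -> int:
    """Estimate the yearly RDSS charge in pounds for a project.

    Assumptions (not confirmed by the post): only storage above the
    free first terabyte is charged, and partial terabytes round up.
    """
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    chargeable_tb = max(0, math.ceil(storage_tb) - FREE_TB)
    return chargeable_tb * PRICE_PER_TB_GBP


for size in (0.5, 1, 12):
    print(f"{size} TB -> £{estimate_annual_cost_gbp(size)} per year")
```

Under these assumptions a 12 TB project would pay for 11 chargeable terabytes.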

Further information about the RDSS can be found at https://www.ucl.ac.uk/isd/services/research-it-services

Love Data Week – Sharing data? Your questions answered

Kirsty, 10 February 2021

Guest post by James Houghton, Research Data Support Officer


Dealing with research data, and the associated legal and administrative issues, can be confusing. This article responds to some of the frequent questions and confusions people have regarding research data management.

Do I always have to share data?

Not always, but in general data sharing is required unless you have a very good reason not to, and UCL expects research to be shared as widely as possible. Data sharing may be inappropriate in the following situations:

  • The project contains personal data which could compromise the privacy of individuals. In this case the Data Protection Act (2018) applies and the data cannot be shared.
  • There is a possibility that the research could be commercialised. In this case, data should not be shared before obtaining necessary patent protections.
  • Other ethical concerns for which a justification can be created. For example, data on an endangered species might be used by poachers so it would be reasonable not to share this data.

If you are ever unsure about releasing data, speak to someone before you proceed. The Library RDM team and the Data Protection Team can advise on this.

Does UCL have a data sharing policy?

Yes, and it specifies the expectations placed on all UCL staff and students on making data available.

Be aware that in addition to the UCL policy, funding agencies will have their own requirements. You need to be compliant with all policies that might apply!

So, I need to share my data. Does UCL have a platform for data sharing?

Yes, we do! UCL has its own data repository service, the UCL Research Data Repository.

I don’t have any data.

The term “data” is used as a shorthand to cover all research outputs, so even if you think you don’t have data, you probably generate something during the course of your research that should be preserved and potentially shared. Even if your field uses a different term you are probably still bound by the data sharing policy.

Here is a wide-ranging list of what could be considered “research data”:

  • Research notebooks, detailing progress of research and experiments
  • Responses to surveys and questionnaires
  • Software, code, algorithms, and models
  • Measurements from laboratory or field equipment
  • Images (such as photographs, films, scans of documents)
  • Methods, protocols, and experimental procedures
  • Databases of collected information
  • A corpus of writings
  • Audio and video recordings
  • Interview transcripts
  • Physical samples and objects

If you have an output not included in this list, it could still be classed as research data!

What on earth is metadata?

Metadata is simply data that describes other data. Here are a few examples:

  • A description of the inclusion criteria for enrolling participants in a study
  • The set of questions used in interviews
  • Any file naming conventions used to keep track of data
  • The parameters used by any equipment used to make measurements
  • The dates and times images were taken
  • Details of quality assurance steps to explain why some data points were deemed to be erroneous and unsuitable for analysis
  • Administrative information such as dates of interviews, experiments or visits to a location

This is not an exhaustive list by any means! Metadata can vary considerably between projects and research fields.

In the same way data might underpin the results of a project, metadata could be said to underpin the methods of a project. If you need to address the issue of metadata, think about what another researcher would need to know to replicate the data as closely as possible.
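
One simple way to keep metadata with the data it describes is to save it in a small file alongside each data file. The sketch below is only an illustration of that idea: the function name, the “.metadata.json” naming and every field and value in the example record are invented for this post, and your own field or repository may expect a specific metadata standard instead.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata next to a data file as '<file name>.metadata.json'."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar


# All values below are invented for illustration.
example = {
    "description": "Temperature readings from a field sensor",
    "collected_on": date(2020, 6, 1).isoformat(),
    "instrument_settings": {"sampling_interval_seconds": 60},
    "file_naming_convention": "site_YYYYMMDD_sensor.csv",
    "quality_notes": "Readings below -50 C were flagged as sensor faults",
}

# Write into a temporary folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "siteA_20200601_sensor1.csv"
    sidecar = write_sidecar(data_file, example)
    print(sidecar.name)
    print(json.loads(sidecar.read_text(encoding="utf-8"))["collected_on"])
```

A plain JSON file like this can be read by people and by software, which helps another researcher understand how the data was produced.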

What resources can I access at UCL to store data safely?

All UCL IT managed storage services have automated backups in place to protect data and are recommended over using your own personal devices or individual cloud storage accounts. There are a few different options depending on your needs:

  • The personal N: drive or the S: drive is fine for day-to-day storage of PDFs, office documents and non-sensitive materials.
  • The Research Data Storage Service supports high speed file transfer for large quantities of data and is extremely useful for anyone who wants to work with the high-performance computing clusters.
  • The Data Safe Haven is specifically designed to store personal data covered by the Data Protection Act 2018. This secure service helps you meet legal obligations on data security when relevant.
  • Services such as SharePoint and OneDrive can be useful for collaboration with colleagues and allow for functionality such as simultaneous editing of documents.

Need more information?

We have extended guidance on research data management available on our website and the library research data management team can be contacted to discuss specific issues at: lib-researchsupport@ucl.ac.uk

Love Data Week – Research Data Management at UCL: 2020 in review

Kirsty, 8 February 2021

To celebrate Love Data Week, the Research Data Management team have prepared a review of 2020, looking back over the past 12 months and reflecting on progress made in a number of areas.

Follow the link below to read the report and find out more about the Research Data Management and Sharing Plan review service, our new online training courses on writing data management plans and on open science and scholarship, and improved guidance about making research data FAIR (findable, accessible, interoperable and reusable) within the wider open science and scholarship context. You can also find out about the newly revised research data policy, which includes updated advice for UCL staff and research students on managing their research outputs.

Finally, you can find out about the number, amount and types of research outputs published using the UCL Research Data Repository, as well as the number and variety of views and downloads.

Download and explore the report on the UCL Research Data Repository.

Deep Dive: DOIs

Kirsty, 8 September 2020

In our recent blog post, PIDs 101, we covered a wide range of Persistent Identifiers (PIDs) and looked at how they link together, and what the future holds for them. This week we are drilling down to investigate Digital Object Identifiers (DOIs) in more detail.

In the last post we described a DOI as a unique registration number for a digital object. A digital object in this context could be an article or a dataset, but it could equally be any of a number of other item types, such as those on this list defined by Crossref.

How do DOIs work?

Each publisher, funder or repository that is registered to provide DOIs is given a unique registration number. This number, along with the ‘10.’ common to all DOIs, forms the first part of a DOI, called the prefix. The prefix is followed by a forward slash and then a suffix, and each registered provider is responsible for choosing their own suffix pattern.

This is where DOIs get extra clever. Each registered provider can construct the suffixes to their own design, and these can be as simple or as complex as needed. For example, the Wellcome Trust uses DOIs for identifying grants as well as publications, and PLOS uses different suffixes to identify which articles come from which journal.

In PLOS DOIs, for example, the unique registration number is 1371, which gives the prefix 10.1371. Each suffix starts by designating the item type, journal, and then follows with an acronym of the individual journal: pbio (PLOS Biology), pone (PLOS ONE) or pgen (PLOS Genetics). Each journal then uses article numbers in a predetermined sequence for the final part of the DOI. These numbers match the article numbers shown in the article citations. Every registered provider needs a scheme like this to generate their DOIs, as it is essential that each item receives a unique DOI.
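
To make the structure described above concrete, here is a minimal sketch that splits a DOI into its parts and reads the journal acronym from a PLOS-style suffix. The function names are our own, the suffix pattern is taken from the description in this post rather than from any PLOS documentation, and the article number in the example is a made-up placeholder rather than a real article.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # always "10" for DOIs
    registrant: str  # the provider's registration number
    suffix: str      # chosen by the provider


def split_doi(doi: str) -> DoiParts:
    """Split a DOI into directory indicator, registrant code, and suffix."""
    prefix, slash, suffix = doi.strip().partition("/")
    directory, dot, registrant = prefix.partition(".")
    if not slash or not suffix or directory != "10" or not dot or not registrant:
        raise ValueError(f"not a DOI: {doi!r}")
    return DoiParts(directory, registrant, suffix)


def plos_journal(doi: str) -> str:
    """Return the journal acronym from a PLOS-style suffix.

    Assumes the suffix pattern 'journal.<acronym>.<article number>'
    described in the post.
    """
    parts = split_doi(doi)
    item_type, acronym, _article = parts.suffix.split(".", 2)
    if parts.registrant != "1371" or item_type != "journal":
        raise ValueError(f"not a PLOS journal DOI: {doi!r}")
    return acronym


# The article number below is a made-up placeholder, not a real article.
example = "10.1371/journal.pbio.0000000"
print(split_doi(example))
print(plos_journal(example))
```

Only the prefix has a fixed meaning across providers. The suffix check here would need rewriting for any provider other than PLOS.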

For every DOI that is generated, it is the responsibility of the provider to send metadata and a link to the top-level webpage for the item to their registration agency. In the UK this is most likely to be Crossref or DataCite. Both agencies make the metadata and DOIs registered with them openly available via APIs, so that they can be used to build overarching databases such as Europe PubMed Central or added into other tools and services like the search interface at doi.org.
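
If you would like to look at this open metadata yourself, the sketch below builds the web addresses for a DOI at the doi.org resolver and at the public Crossref and DataCite REST APIs. The function names are our own and the DOI in the example is a made-up placeholder. The fetch function needs a network connection and simply returns whatever JSON the agency sends back; the layout of that record differs between Crossref and DataCite and is not interpreted here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# The resolver and the two registration agencies' public REST APIs.
ENDPOINTS = {
    "resolver": "https://doi.org/{doi}",
    "crossref": "https://api.crossref.org/works/{doi}",
    "datacite": "https://api.datacite.org/dois/{doi}",
}


def doi_url(doi: str, service: str = "resolver") -> str:
    """Build the URL for a DOI at the resolver or at an agency's API."""
    if service not in ENDPOINTS:
        raise ValueError(f"unknown service: {service!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return ENDPOINTS[service].format(doi=quote(doi, safe="/"))


def fetch_metadata(doi: str, service: str = "crossref") -> dict:
    """Download the JSON metadata record for a DOI from an agency API.

    Requires network access; use 'crossref' or 'datacite' as the service.
    """
    with urlopen(doi_url(doi, service), timeout=30) as response:
        return json.load(response)


# Placeholder DOI; the article number is made up for illustration.
print(doi_url("10.1371/journal.pone.0000000", "crossref"))
```

A DOI is registered with one agency only, so a lookup at the wrong agency is expected to fail rather than return a record.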

The different publishers, repositories, universities and funders all have a responsibility to keep the metadata of all of the DOIs they generate up to date. This is important in order for the DOI to be persistent. For example, if your chosen journal changes publisher after your article has been published, it is the responsibility of the publisher to facilitate updating the metadata of every article so that you will still be able to find your article using the DOI.

Why is having a DOI beneficial?

The purpose of a DOI is to accurately identify, link to and discriminate between online works. DOIs are unique to the work they identify and permanently link to it. This means that a DOI must link to the authoritative and authentic web presence for the work hosted on a sustainable platform.

So, having a DOI for your work (whatever it may be) means that it will always be findable: even if the journal where it was originally published no longer exists, there will always be a record of your work no matter how much time has passed. It also helps ensure that your work is cited properly, and that every mention of it is correctly attributed and easy to track. If your work has a DOI, it can be included in other tools like Altmetric or Plum Analytics. These tools track mentions of works in social media, news media, policy documents and other places.

How do I get a DOI for my work?

Most journals are able to provide you with a DOI for your article. If your publisher does not have the facility to give you a DOI, or you wish to get a DOI for another type of material, the simplest way to go about getting one is to create a record in a repository that can provide a DOI for you.

At UCL we have the Research Data Repository (RDR) which can accept a wide range of outputs including data, figures, presentations, software, posters, even images and other media. There is the option in the record creation process to ‘Reserve’ a DOI which will become live once the record is checked and verified by the RDR team.

Outside UCL, there are also independent repositories that are able to give you a DOI. You can choose a subject repository appropriate for your data – there is lots of information available on the Research Data Management team website – or a generic one such as the UK Data Archive, Zenodo, Figshare or Dryad.

Open Access and your Research in a COVID-19 World

Kirsty, 6 May 2020

On 20 March, days after lockdown began, JISC and partners issued a statement calling for publishers to help in the global effort to combat COVID-19 and support institutions and students to continue their education by making resources available where possible. Since that day, numerous publishers have made temporary changes to their policies, and have begun to make more content freely available online. The Library has been maintaining a list of these newly open resources on the website, along with other help and advice for finding and using resources remotely. There are also lists of resources available from the British Library as well as a brilliant collated list of data and computational resources from the National Institutes of Health.

The Copyright Licensing Agency has also made some temporary adjustments to the licence that allows books to be scanned and shared. Please contact the Teaching & Learning Services team for more information.

In addition, there are now tools that allow you to search the web for trustworthy Open Access versions of content from inside your web browser. Just searching Google can not only bring up illegal copies of material, but also inadvertently support predatory and fake journals. The recommended tool is called Open Access Button. More information about Open Access Button is available here.

Open Access choices

Just because publishers are making things open for the time being, doesn’t mean they will stay that way. Be careful about the choices you make for your research – in the long term, will the publisher of your chosen journal stop access to your paper? When you are choosing the journal to submit your research to, take a look at the guidance provided by the Open Access team, and also check Sherpa/Romeo to find out whether you are allowed to share your work on RPS, or even on a pre-print service to get it out there even faster!

Don’t forget that you can use the Research Publications Service (RPS) as well as the Research Data Repository (RDR) to share all of your research outputs openly with the rest of the research community.

Doctoral theses in UCL’s repository

Patrycja, 25 October 2018

At UCL, candidates for research degrees are required to deposit an electronic copy of their final thesis in UCL’s Research Publications Service (RPS), to be made open access in UCL’s institutional repository, UCL Discovery. Students can choose to restrict public access to their thesis for a variety of reasons, such as future publication, copyright restrictions or sensitive data, but most are made open access immediately or after a delay of no more than 12 months.

The requirement to submit an electronic copy of your thesis as a condition of award has been in place at UCL since 2009. In addition to that, we have retrospectively digitised theses from earlier years, as a part of a collaborative project with ProQuest. So far, about 3,500 theses have been made available in UCL Discovery as a part of this collaboration. Theses are also digitised through the British Library’s e-Theses Online Service (EThOS), upon request.

In total, there are over 10,500 theses available in UCL’s institutional repository, dating as far back as 1933. UCL theses are amongst our most-downloaded items! The most popular is a 1990 thesis, Marketing theories and concepts for the international construction industry, available here. Amongst the theses available there are some completed by notable UCL alumni:

Julian Baggini, philosopher and author of popular books on philosophy, including A Short History of Truth, The Pig that Wants to be Eaten and 99 other thought experiments, and most recently How the World Thinks. Baggini completed his PhD in 1996, and his thesis on the philosophy of identity was recently made available here: http://discovery.ucl.ac.uk/10057733/

Adam Rutherford, geneticist and author, has produced several science documentaries, and hosts the BBC Radio 4 programme Inside Science. He completed his PhD at UCL in 2002, and his thesis on the role of a specific gene (CHX10) in eye development was recently made available in UCL Discovery: http://discovery.ucl.ac.uk/10057801/

Chris Van Tulleken, together with his twin brother and fellow doctor Xand, makes programmes on various aspects of health, most recently Operation Ouch for CBBC. He is also an infectious diseases doctor and MRC Clinical Research Fellow at University College London Hospital, and completed his PhD in 2017. Chris’ thesis is available here: http://discovery.ucl.ac.uk/1567969/