Open@UCL Blog

Archive for the 'Open Science' Category

The benefits and barriers to code sharing as an Early Career Researcher

Kirsty, 14 September 2021

Guest post by Louise Mc Grath-Lone, Research Fellow (UCL Institute of Health Informatics), Rachel Pearson, Research Assistant (UCL Institute of Child Health) and Ania Zylbersztejn, Research Fellow (UCL Institute of Child Health)

In July 2021, we held a session on code sharing as part of the UCL Festival of Code and were thrilled to have almost 90 attendees from 9 out of UCL’s 11 faculties – highlighting that researchers from across a wide range of disciplines are interested in sharing their code.

The aims of the session were to highlight the benefits of code sharing, to explore some of the barriers to code sharing that Early Career Researchers may experience, and to offer some practical advice about establishing, maintaining and contributing to a code repository.

In this blog, we summarise the benefits and barriers to code sharing we discussed in the session, taking into account the views that participants shared.

What is code sharing and what are the benefits?

Code sharing covers a range of activities, including sharing code privately (e.g., with your colleagues as part of internal code review) or publicly (e.g., as part of a journal article submission).

For Early Career Researchers in academia, there are many benefits to sharing code including:

Reducing duplication of effort: For activities such as data cleaning and preparation, code sharing is an important method of reducing duplication of effort among the research community.

Capturing the work you put into data management: The processes of managing large datasets are time-consuming, but this effort is often not apparent in traditional research outputs (such as journal articles). Sharing code is one way of demonstrating the work that goes into data management activities.

Improving the transparency and reproducibility of your work: Code sharing allows others to understand, validate and extend what you did in your research.

Enabling the continuity of your work: Many researchers spend the early years of their career on fixed-term contracts. Code sharing is a way to enable the continuity of your work after you’ve moved on by allowing others to build on it. This increases the chances of it reaching the publication stage and your efforts and inputs being recognised in the form of a journal article.

Building your reputation and networks: Code sharing is a way to build your reputation and grow your networks which can lead to opportunities for collaboration.

Providing opportunities for teaching and learning: By sharing code and by looking at code that others have shared, Early Career Researchers have opportunities to both teach and learn.

Demonstrating a commitment to Open Science principles: Code sharing is increasingly valued by research funders (e.g. the Wellcome Trust) and is a tangible way to show your commitment to Open Science principles which are part of UCL’s Academic Framework and important for career progression.

Despite the clear benefits to code sharing, at the start of our session just 1 in 4 participants (26%) said that they often or always share code. However, by the end of the session, almost all participants (90%) said that they definitely or probably will share their code in the future.

What are the barriers to code sharing as an Early Career Researcher and how can we overcome them?

We asked participants what has put them off sharing their code in the past. The most common responses were:

The time and effort required: Ideally, you would write perfectly formatted and commented code on the first go – however, in reality, it often does not work out like this. As you update code and encounter bugs, code can often become messy, and considerable time and effort may be needed to get it to the point where it can be understood by someone outside the research project. We discussed the importance of shifting your perception of ‘shareable’ code. Sharing any code, even if messy, is far more helpful than sharing nothing at all.

Lack of confidence and concerns about criticism: Many researchers who write code as part of their work have very little (or no!) formal training. This means that sharing code can be daunting. For example, researchers may be worried about others finding errors in their code; however, sharing code can help to catch bugs in code early on and can bolster your confidence and reassure you that your code is correct. In the session, we also discussed how getting involved with online coding communities that emphasize inclusivity and support (e.g., R Ladies, Tidy Tuesday or one of the UCL Coding Clubs) can help grow confidence and provide a kinder environment in which to share code publicly.

Not knowing how to share or who to share with: A lack of formal training means that many researchers are unsure about where or how to share code, including not knowing which license to use to enable appropriate reuse of code. We discussed the need for more training opportunities and encouraged setting up your own code review groups (like a journal club, but for sharing and discussing code).

Worry that code will be reused without permission: Some participants were worried about plagiarism and their hard work being re-used without their knowledge or permission. However, hosting your code in a repository like GitHub allows you to choose a suitable licence for re-use of your code to prevent undesired use while still supporting open science! You can also see how many people have accessed your code.

How can Early Career Researchers get started with code sharing?

Preparing code to share can take time and, as they work to secure their future within academia, many Early Career Researchers may already feel overloaded and pulled in different directions (e.g., teaching, institutional citizenship, engagement work, producing publications, attending conferences, research management, etc.). However, code sharing is hugely beneficial for a career in academia and so we would encourage all Early Career Researchers to try to find the time to share code by viewing it as an opportunity to invest in your future self. For example, you could:

  • Adopt a coding style guide to help produce clear and uniform code with good comments from the outset. This will reduce the effort needed when you come to share code (and help your future self when you look at your code many years later and have inevitably forgotten what it all does!).
  • Join one of the UCL Coding Clubs or an online community to learn tips from others about coding and sharing code.
  • Learn to use a code repository like GitHub. As part of our session, we delivered an introductory tutorial on how to use GitHub with links to other useful resources (available here).
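
As a minimal sketch of what the first suggestion above can look like in practice, the Python function below follows common style-guide conventions (descriptive names, a docstring and brief comments). The function name, the 'age' field and the cleaning rule are hypothetical and chosen only for illustration; they are not taken from the session materials.

```python
def clean_ages(records, min_age=0, max_age=120):
    """Return copies of the records whose 'age' is a plausible whole number.

    Hypothetical example: `records` is a list of dicts, each expected to
    have an 'age' key. Records with a missing, non-numeric or out-of-range
    age are dropped rather than corrected.
    """
    cleaned = []
    for record in records:
        try:
            age = int(record["age"])
        except (KeyError, TypeError, ValueError):
            # Skip records with a missing or non-numeric age.
            continue
        if min_age <= age <= max_age:
            # Copy the record so the caller's data is left unchanged.
            cleaned.append({**record, "age": age})
    return cleaned


# Made-up input data, for illustration only.
raw = [{"id": 1, "age": "34"}, {"id": 2, "age": "unknown"}, {"id": 3, "age": 250}]
print(clean_ages(raw))
```

Even a short docstring like this one tells a future reader (including your future self) what the function expects and what it deliberately does not do.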

How can UCL support Early Career Researchers to share code?

We ended the session by asking the participants how UCL could better support them to share their code. Some of the ideas suggested by Early Career Researchers were:

More training on writing and sharing code: For example, one suggestion was that UCL could create a Moodle training course for code sharing. Training about best practice in coding (across several languages) to help Early Career Researchers to write code right the first time would also be helpful.

Simple, accessible guidance about code sharing: This might include checklists or 1-to-1 advice sessions, in particular, to help Early Career Researchers to select the right licenses.

Embed code sharing as best practice at all levels: Encouraging and supporting senior researchers to share code so that it becomes embedded as good practice at all levels would provide a good example for and encourage more junior members of staff. It would also help to ensure that the time and training required to prepare code for sharing is built into grant applications.

Knowledge sharing opportunities: More events and opportunities to discuss how research groups share code, so that best practice can spread across faculties throughout UCL.

 

We would like to thank everyone who attended our session – “Code sharing for Early Career Researchers: the good the bad and the ugly!” – at the UCL Festival of Code for their time and contributions to the lively discussions. All the materials from the session are available here, including an introductory tutorial to getting started with code sharing using GitHub. We would also like to thank the organisers of the UCL Festival of Code for their help and support.

Open Science & Scholarship – the name debate

Kirsty, 3 September 2021

The phrase ‘Open Science’ originated in Europe, but when used in English it gives us a slight issue. To most English speakers, the word ‘science’ conjures a specific image of scientists in labs, chemistry, physics… not the all-encompassing view that Open Science actually represents.

The word ‘science’ here actually draws on the Latin word scientia, meaning knowledge, which is why outside the UK it is much more easily accepted for what it represents: opening up what we know, the data we used to learn it and how we got there, across all areas of research, not just Natural Sciences but Social Sciences, Arts and Humanities.

To combat the misunderstanding about the word science, a whole host of alternatives are used across academic institutions in the UK. There is a great article on the UKCORR blog by Nick Shepherd which touches on this and lists other universities’ choices and position statements.

Here at UCL, back in October 2020 we launched the UCL Office for Open Science & Scholarship, fully embracing the term Open Science and its connection to Europe, building on the LERU 8 Pillars of Open Science but also including Open Scholarship in an effort to make the name sound more inclusive here in the UK, pushing back against that instinctive image that is conjured by the word science.

We have recently released a full discussion of the principles behind the Office for Open Science & Scholarship, our name, and our commitments: https://www.ucl.ac.uk/library/research-support/open-science/defining-open-science-scholarship

ORCiDs in RPS: have you added yours?

Kirsty, 23 July 2021

There are a number of ways that having an ORCiD can be useful:

  • you can use it to distinguish yourself from other researchers, especially if you have a common name,
  • you can use your ORCiD to easily find and connect to your outputs, activities, contributions and affiliations
  • your ORCiD iD can also be used in place of a publications list or CV in applications to present your full list of contributions in one place
  • and finally, you can connect your ORCiD to a growing number of institutions, funders, and publishers, including RPS here at UCL.

Linking your ORCiD to your account in RPS can have a number of additional benefits, key among which is to improve the accuracy of the auto-claiming of your publications. In addition to this, you can also allow RPS to send publications that you claim over to ORCID on your behalf, called ‘Read and Write’ in the table below.

School | ORCID Read & Write in RPS (Jun 21) | Total ORCID in RPS (Jun 21)
IOE    | 42% | 77%
BEAMS  | 30% | 73%
SLMS   | 29% | 71%
SSEES  | 13% | 76%
SLASH  | 22% | 66%
Total  | 29% | 71%

Back in January, 22% of research staff had linked their ORCiD to RPS and were using it to send content from RPS to their ORCID record. Now, 6 months later, that total has increased to 29%, with IOE leading the way with an impressive 42% of research staff using this feature.

Overall, over 70% of research staff at UCL have linked their ORCID to RPS in some way, but that means that there are still some people that aren’t taking advantage of this and using their ORCiD to its best effect.

To get more information about how to add your ORCiD to RPS, take a look at the guide provided by the Open Access team, or one of our previous blog posts that outlines more information about the ways to best use your ORCiD.

Copyright and Text & Data mining – what do I need to know?

Kirsty, 6 July 2021

Text and Data Mining (TDM) is a broad term used to cover any advanced techniques for computer-based analysis of large quantities of data of all kinds (numbers, text, images etc). It is a crucial tool in many areas of research, including notably Artificial Intelligence (AI). TDM can be used to reveal significant new facts, relationships and insights from the detailed analysis of vast amounts of data in ways which were not previously possible. An example would be mining medical research literature to investigate the underlying causes of health issues and the efficacy of treatments.
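
To make the idea concrete, here is a deliberately tiny sketch in Python of one very simple kind of computational analysis: counting how often pairs of terms appear together in a set of texts. The function name, the three "abstracts" and the term list are all invented for illustration; real TDM projects work at a far larger scale and use much more sophisticated methods.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count how many documents mention each pair of terms together."""
    terms = [term.lower() for term in terms]
    counts = Counter()
    for text in documents:
        # Split the text into a set of lower-case words.
        words = set(re.findall(r"[a-z]+", text.lower()))
        # Sort so that each pair is always recorded in the same order.
        present = sorted(term for term in terms if term in words)
        counts.update(combinations(present, 2))
    return counts


# Made-up example texts, not real research abstracts.
abstracts = [
    "Smoking was associated with asthma in children.",
    "Asthma symptoms improved after treatment; smoking status was recorded.",
    "Treatment reduced hospital admissions.",
]
print(cooccurrence_counts(abstracts, ["asthma", "smoking", "treatment"]))
```

Note that even this toy analysis needs the full text of every document loaded onto the machine doing the counting, which is the copying step that the rest of this post is concerned with.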

The importance of having copyright exceptions in place to facilitate TDM arises from the fact that the swathes of material which need to be mined are often protected by copyright. That would be true for example of “literary works” of all kinds and of images in many cases. It is frequently the case that researchers will have lawful access to the material but will be prevented from applying TDM techniques because copying the material onto the required computer platform risks legal action for infringement on the part of the copyright owners. “Copying” is of course one of the acts restricted by copyright law and in general the greater the amount and variety of material, the greater the copyright risk.

It is worth remembering that when the Government created an exception for Text and Data Mining in 2014, it meant that the UK was ahead of the game. Other countries did not generally have an exception in their legislation at that time. Since then, other jurisdictions have caught up and, in some cases, overtaken the UK. Cutting edge research is a highly competitive area and researchers working in a country which benefits from a generous TDM exception will have a distinct advantage.

The existing exception is still significant from the Open Science perspective in enabling research projects where computer analysis of large quantities of copyright-protected material is required, particularly in the context of AI.

Let’s take a closer look at the UK TDM exception and what it allows us to do, before comparing it briefly with the more recent EU exceptions. The UK exception is to be found in Section 29A of the Copyright, Designs and Patents Act 1988.

What does the exception allow us to do?

Copying copyright-protected works in order to carry out “text and data analysis” (“computational analysis” in the wording of the exception). The need to copy arises because researchers must have the material to be analysed on a specific platform, to carry out the analysis. The need for the exception then arises because without it, the researcher would require permission from the owner of copyright in each item. Without permission (or an exception), the researchers would be infringing copyright by copying a vast swathe of protected material. That in turn would often make the research impractical to carry out.

Who may do this?

Absolutely anyone: the exception says “a person.” This is wonderfully broad and one of the more favourable aspects of the UK exception. For example, you don’t need to be working for or studying at a particular type of institution to benefit from the exception.

Are there conditions?

You must have lawful access to the material. A prime example would be the text of academic journals. We have lawful access to large numbers of e-journals because UCL Library subscribes to them. The exception would allow a UCL researcher to download large amounts of content from e-journals to carry out detailed analysis using specialised tools. It is important to note that the exception cannot be overridden by contract terms. It follows that a term in an e-journal contract seeking to prevent TDM would have no force, in circumstances where the exception applies. This makes the exception a much more useful tool than it would otherwise be.

As you might expect the copies made for TDM purposes may not be used for other purposes, shared etc under the exception.

Significantly, the analysis must be “…for the sole purpose of research for a non commercial purpose.” This is a major restriction, which would rule out many situations where TDM might be used, for example research by a pharmaceutical company developing new drugs which will be marketed commercially. A major issue with the exception is that it can be unclear at what point “non-commercial” shades into “commercial.” A project which starts out as academic research may take on commercial significance down the line and a piece of research with no commercial aspects may be funded by commercial sponsors. It is an important constraint in the legislation which can also be difficult to be sure about in real life situations. It can stand in the way of joint projects by HEIs and commercial organisations.

Still, in situations where we can claim there is no commercial aspect to the research, the exception is potentially very useful. In addition to material which is already digital, it can cover projects where copyright-protected print material must be digitised before it can be analysed. It can be very useful in situations where the copyright status of the source material is unclear, since provided the exception applies, there is no need to investigate further the complexities of copyright in the material.

The new EU TDM exception or rather exceptions

The EU Directive on Copyright in the Digital Single Market (DSM Directive) offers two new exceptions, which EU member states are obliged to transpose. They can be found in Articles 3 and 4 of the Directive.

There are important differences of approach to the UK in the answer to the question:  who may carry out the TDM? Article 3 provides an exception which benefits two defined categories of organisations: “Research organisations” and “Cultural heritage organisations.” Included within those groups are for example universities, museums, publicly funded libraries. Commercial organisations are excluded. It seems that independent researchers, not associated with an organisation would also be excluded, even though their research might be “non-commercial.” In common with the UK legislation, this exception cannot be overridden by contract terms and is therefore a powerful tool. The Directive addresses the question of public-private research collaborations in the recitals to the directive, e.g. recital 11. They are not excluded from benefitting from the Article 3 exception.

Article 4 offers a separate TDM exception which is available to anyone (including commercial organisations) but which is limited in a specific way: If the rights owners explicitly reserve the rights to carry out TDM within their works, then it cannot be mined under the exception. In other words, the EU DSM Directive goes one step further than the UK by offering an exception which can be used to mine lawfully accessible works by commercial organisations (or by anyone else), but it does not apply if the rights owner has explicitly ruled out TDM.  By contrast, commercial organisations would not be able to use the UK exception, unless they can claim the specific research is for a non-commercial purpose.

Guest post by Chris Holland, UCL Copyright Support Officer. For more information or advice contact: copyright@ucl.ac.uk

Introduction to the CRediT taxonomy

Kirsty, 21 June 2021

The Contributor Roles Taxonomy (CRediT) describes 14 roles that represent the parts typically played by contributors to a scholarly output. The CRediT taxonomy has been adopted across a growing range of publishers to improve the visibility of the range of contributors to published research outputs. The established list of publishers and individual journals that use the roles is available online and also includes a few submission, peer review and research workflow tools.
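
As an illustration of how the 14 roles can be used as a controlled vocabulary, the short Python sketch below checks a contributor statement against the list of CRediT role names. The helper function and the two contributors are hypothetical; this is one possible way to represent the taxonomy, not part of CRediT itself or of any publisher's system.

```python
# The 14 CRediT role names.
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})


def unknown_roles(contributions):
    """Return, per contributor, any listed roles that are not CRediT roles."""
    problems = {}
    for contributor, roles in contributions.items():
        unrecognised = sorted(set(roles) - CREDIT_ROLES)
        if unrecognised:
            problems[contributor] = unrecognised
    return problems


# Hypothetical contributor statement, for illustration only.
statement = {
    "A. Researcher": ["Conceptualization", "Writing – original draft"],
    "B. Analyst": ["Data curation", "Software", "Coding"],
}
print(unknown_roles(statement))
```

Here "Coding" is flagged because the taxonomy's term for that kind of contribution is "Software".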

The taxonomy also brings a number of additional practical benefits to the research environment, including:

  • Reduce the potential for author disputes.
  • Enable visibility and recognition of the different contributions of researchers, particularly in multi-authored works – across all aspects of the research being reported (including data curation, statistical analysis, etc.)
  • Support identification of peer reviewers and specific expertise.
  • Enable funders to more easily identify those responsible for specific research products, developments or breakthroughs.
  • Improve the ability to track the outputs and contributions of individual research specialists and grant recipients.
  • Ease identification of potential collaborators and opportunities for research networking.
  • Enable new indicators of research value, use and re-use, credit and attribution.

We have recently added information about the CRediT taxonomy to the Open Access website, to make sure that you can get all information related to publishing your research in the same place, and as always, the Office for Open Science & Scholarship, and the Open Access team are available to answer any questions you may have on this or any other related topic.

CRediT updates

In April 2020 the National Information Standards Organization (NISO) announced the formal launch of its work to develop the Contributor Role Taxonomy (CRediT) as a full ANSI/NISO standard.

Later in 2020, CRediT was awarded grant funding from the Alfred P. Sloan Foundation and Wellcome Trust. The funds will be used to support implementations of the taxonomy across scholarly publishers, and within the scholarly research ecosystem more broadly once the standard is established.

During the early part of 2021, ORCID officially started supporting CRediT. As part of the upgraded API, journals can share CRediT contributions with ORCID and include them in your ORCID record. For more information about ways to automate updates to your ORCID record, check out our blog post on the subject.

Open Science & Scholarship Newsletter – June 2021

Kirsty, 2 June 2021

Welcome to the third issue of the Open Science and Scholarship Newsletter!

This termly newsletter has updates across the 8 Pillars of Open Science, and contributions from colleagues across the university. If you would like to get involved, give feedback or write something for a future issue, please get in touch using the details at the end of the newsletter or by leaving a comment below.

In this issue:

  • Editorial
  • Update from the Head of the Office for Open Science & Scholarship
  • Community voice – Reliability and Reproducibility in Computational Science
  • Special Feature – Open Science in Horizon Europe
  • Deep Dive – Top posts from our blog
  • News and Events

Go to the newsletter on Sway, or view it below. If you use the version below, we recommend clicking the ‘full screen’ button to get the full experience!

When viewing a Sway, you can turn on Accessibility view. This view displays a high-contrast style for easier reading, disables any animations, and supports keyboard navigation for use with screen readers.

To turn on Accessibility view:

  • If you’re using a mouse or touchscreen, on the More options menu (shown as three dots on the Sway toolbar), choose Accessibility view.
  • If you’re using a screen reader, on the More options menu, when Accessibility view is selected, you hear “Displays this Sway in a high contrast design with full keyboard functionality and screen reader access to all content.”

Upcoming webinar: Focus on Open Science

Kirsty, 1 June 2021

The UCL Office for Open Science & Scholarship is collaborating with the University of Stockholm and Scientific Knowledge Services on organising an Open Science Webinar on 18 June.

Open Science started as a vision, aiming to address matters like research reproducibility and access to the results of publicly-funded research. The vision was generally welcomed by academic and research institutions and has benefited from a great advocacy movement. It’s high time now to build on practice and effective management.

It is generally accepted in Europe that research should be as open as possible and as closed as necessary. Finding the borderline between the two is one of the most important tasks for practitioners, whether they belong to funders, research organisations, their partners or researchers themselves. This borderline is not sufficiently explored. Guidelines based on feedback and learning from practice should be created, sooner rather than later. This innovative approach to research has further potential: to address existing inequalities and matters like inclusivity, ethics, better assessment or the missing links between science and society or to re-shape public-private partnerships.

Emphasizing research practices, we will discuss the role of research organisations in supporting this transition, acting both locally and internationally.

The webinar is a part of the #FocusOpenScience series. The language of the presentations will be English.

Visit https://www.focusopenscience.org/book/21stockholm/ for further details, and to register.

UCL Open Science Conference Day 2: Tuesday 27th April

Kirsty, 30 April 2021

We have now collated all of the recordings and uploaded them to UCL Media Central. A full write-up of the event and some remaining questions will follow next week.

Day 1 content is also available

13:10 – 13:30 Count-erproductive? The role of metrics in the advancement of Open Science: Lizzie Gadd

Open in Media Central or view below

13:40 – 14:00 Toolkit for Transparency, Reproducibility & Quality in Energy Research: Gesche Huebner & Mike Fell

Full paper link https://journal-buildingscities.org/articles/10.5334/bc.67/

Open in Media Central or view below

14:20 – 15:00 Reproducibility, Transparency & Metrics panel

Open in Media Central or view below

15:25 – 16:05 Citizen science panel

Open in Media Central or view below

Links to Monica’s projects:

UCL Open Science Conference Day 1: Monday 26th April

Kirsty, 30 April 2021

We have now collated all of the recordings and uploaded them to UCL Media Central. A full write-up of the event and some remaining questions will follow next week.

Day 2 content is also available

13:10 – 13:40 Open Science – looking to the future: Jean-Claude Burgelman

13:40 – 13:55 Open Science at UCL – looking to our future: Paul Ayris

Open in Media Central or view below

14:20 – 15:00 Future of Open Science panel

Open in Media Central or view below

15:25 – 16:05 Technical solutions panel

Open in Media Central or view below

ORCID Updates for 2021

Kirsty, 14 April 2021

Over the past year, we have written a number of blog posts talking about ORCID and giving you lots of options for how you can make the best use of your ORCID, including using it to add your research outputs to RPS, and a series of ways that you can automatically populate your ORCID and save time! While all of these posts are still relevant, and we would recommend you having a look, there are a few updates that we wanted to share with you.

ORCID have recently added Data Management Plan as a new work-type you can include in your ORCID, which is great news. In addition to this, ORCID have now made it possible to record funding peer review contributions in your ORCID record by linking your ORCID to Je-S, increasing the number of work types you can add to ORCID to 44!

ORCID have also relaunched the help and support part of their website info.orcid.org to make it easier to access updates, FAQs and blog posts. I really enjoyed this recent post in which they interviewed Dr. Romero-Olivares, assistant professor at New Mexico State University, about her experiences using ORCID throughout her career and the ways that having an ORCID has made maintaining her CV easier over the years.

After this blog was published, ORCID also announced that they have started supporting CRediT – the Contributor Roles Taxonomy. This is a great step, and so keep an eye out if you have published in a journal that uses CRediT to add this to your ORCID record soon!

Finally, ORCID have released a new video tour of the ORCID record that you can see below. In addition to their previous video in our prior posts telling you about what ORCID is and its advantages, this video aims to remind you of the key features of the interface and answer a few questions you may have about how to maintain your personal ORCID record.

A Quick Tour of the ORCID Record from ORCID.