
Open@UCL Blog


Open Source Software Design for Academia

By Kirsty, on 27 August 2024

Guest post by Julie Fabre, PhD candidate in Systems Neuroscience at UCL. 

As a neuroscientist who has designed several open source software projects, I’ve experienced firsthand both the power and pitfalls of the process. Many researchers, myself included, have learned to code on the job, and there’s often a significant gap between writing functional code and designing robust software systems. This gap becomes especially apparent when developing tools for the scientific community, where reliability, usability, and maintainability are crucial.

My journey in open source software development has led to the creation of several tools that have gained traction in the neuroscience community. One such project is bombcell, software designed to assess the quality of recorded neural units. This tool replaces what was once a laborious manual process and is now used in over 30 labs worldwide. I have also developed several smaller toolboxes for neuroscience.

These efforts were recognized last year when I received an honourable mention in the UCL Open Science and Scholarship Awards.

In this post, I’ll share insights gained from these experiences. I’ll cover, with some simplified examples from my toolboxes:

  1. Core design principles
  2. Open source best practices for academia

Disclaimer: I am not claiming to be an expert. Don’t view this as a definitive guide, but rather as a conversation starter.


Follow Julie’s lead: Whether you’re directly involved in open source software development or any other aspect of open science and scholarship, or if you simply know someone who has made important contributions, consider applying yourself or nominating a colleague for this year’s UCL Open Science and Scholarship Awards to gain recognition for outstanding work!


Part 1: Core Design Principles

As researchers, we often focus on getting our code to work, but good software design goes beyond mere functionality. Following a few principles from the get-go will elevate your software from “it works” to “it’s a joy to use, maintain and contribute to”, and will make it far easier to maintain and build upon.

1. Complexity is the enemy

A primary goal of good software design is to reduce complexity. One effective way to simplify complex functions with many parameters is to use configuration objects. This approach not only reduces parameter clutter but also makes functions more flexible and maintainable. Additionally, breaking down large functions into smaller, more manageable pieces can significantly reduce overall complexity.

Example: Simplifying a data analysis function

For instance, in bombcell we run many different quality metrics, and each quality metric is associated with several other parameters. In the main function, instead of inputting all the different parameters independently:

[qMetric, unitType] = runAllQualityMetrics(plotDetails, plotGlobal, verbose, reExtractRaw, saveAsTSV, removeDuplicateSpikes, duplicateSpikeWindow_s, detrendWaveform, nRawSpikesToExtract, spikeWidth, computeSpatialDecay, probeType, waveformBaselineNoiseWindow, tauR_values, tauC, computeTimeChunks, deltaTimeChunks, presenceRatioBinSize, driftBinSize, ephys_sample_rate, nChannelsIsoDist, normalizeSpDecay, (... many many more parameters ...), rawData, savePath);

they are all stored in a ‘param’ object that is passed to the function:

[qMetric, unitType] = runAllQualityMetrics(param, rawData, savePath);

This approach reduces parameter clutter and makes the function more flexible and maintainable.
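Building the param object is then just a matter of setting fields on a struct. As a minimal sketch (the specific values below are invented for illustration, and only the parameter names from the long signature above are assumed):

```matlab
% Gather related settings into one struct instead of many loose arguments.
param = struct();
param.verbose = true;                    % print progress messages
param.saveAsTSV = true;                  % also write results as .tsv files
param.duplicateSpikeWindow_s = 0.00001;  % window for duplicate-spike detection

% One short, stable call site, however many parameters exist internally:
[qMetric, unitType] = runAllQualityMetrics(param, rawData, savePath);
```

A nice side effect is that adding a new parameter later changes only the struct, not every call site in every analysis script.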

2. Design for change

Research software often needs to adapt to new hypotheses or methodologies. When writing a function, ask yourself “what additional functionalities might I need in the future?” and design your code accordingly. Implementing modular designs allows for easy modification and extension as research requirements evolve. Consider using dependency injection to make components more flexible and testable. This approach separates the creation of objects from their usage, making it easier to swap out implementations or add new features without affecting existing code.
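To make the dependency-injection idea concrete, here is a minimal hypothetical MATLAB sketch (the `preprocessWaveform` function and the `detrendFun` field are illustrative, not part of bombcell): instead of hard-coding which detrending routine a pipeline step uses, the caller injects it as a function handle.

```matlab
% Hypothetical sketch: the caller injects the detrending method as a
% function handle, so the pipeline never hard-codes one implementation.
param.detrendFun = @(waveform) detrend(waveform);

function waveform = preprocessWaveform(param, waveform)
% This step only knows it was given *a* detrending function; swapping in
% a different method later requires no change to this code.
waveform = param.detrendFun(waveform);
end
```

Swapping in a custom routine is then a one-line change at the call site, and each implementation can be tested in isolation.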

Example: Modular design for a data processing pipeline

Instead of a monolithic script:

function runAllQualityMetrics(param, rawData, savePath)
% Hundreds of lines of code doing many different things
(...)
end

Create a modular pipeline that separates each quality metric into a different function:

function qMetric = runAllQualityMetrics(param, rawData, savePath)
nUnits = length(rawData);
for iUnit = 1:nUnits
% step 1: calculate percentage spikes missing
qMetric.percSpikesMissing(iUnit) = bc.qm.percSpikesMissing(param, rawData);
% step 2: calculate fraction refractory period violations
qMetric.fractionRPviolations(iUnit) = bc.qm.fractionRPviolations(param, rawData);
% step 3: calculate presence ratio
qMetric.presenceRatio(iUnit) = bc.qm.presenceRatio(param, rawData);
(...)
% step n: calculate distance metrics
qMetric.distanceMetric(iUnit) = bc.qm.getDistanceMetric(param, rawData);
end
bc.qm.saveQMetrics(qMetric, savePath)
end

This structure allows for easy modification of individual steps or addition of new steps without affecting the entire pipeline.

In addition, this structure allows us to easily define new parameters that modify the behaviour of the subfunctions. For instance, we can add different methods (such as the ‘gaussian’ option below) without changing how any of the functions are called!

param.percSpikesMissingMethod = 'gaussian';
qMetric.percSpikesMissing(iUnit) = bc.qm.percSpikesMissing(param, rawData);

and then, inside the function:

function percSpikesMissing = percSpikesMissing(param, rawData)
if strcmp(param.percSpikesMissingMethod, 'gaussian')
(...)
else
(...)
end
end

3. Hide complexity

Expose only what’s necessary to use a module or function, hiding the complex implementation details. Use abstraction layers to separate interface from implementation, providing clear and concise public APIs while keeping complex logic private. This approach not only makes your software easier to use but also allows you to refactor and optimize internal implementations without affecting users of your code.

Example: Complex algorithm with a simple interface

For instance, bombcell has many parameters. When we run the main function that calls all the quality metrics, we also want to ensure all parameters are present and in the correct format.

function qMetric = runAllQualityMetrics(param, rawData, savePath)
% Complex input validation that is hidden to the user
param_complete = bc.qm.checkParameterFields(param);

% Core function that calculates all quality metrics
nUnits = length(rawData);

for iUnit = 1:nUnits
% steps 1 to n
(...)
end

end

Users of this function don’t need to know about the input validation or other complex calculations. They just need to provide input and options.
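As a rough sketch of what such hidden validation might do (the default values below are invented for illustration; bombcell’s real bc.qm.checkParameterFields is more thorough), missing fields can simply be filled in with defaults:

```matlab
function param = checkParameterFields(param)
% Fill in any missing parameter fields with defaults, so the rest of the
% pipeline can safely assume a complete param struct.
% NB: these defaults are illustrative placeholders, not bombcell's values.
defaults = struct('verbose', true, 'plotGlobal', false, 'computeSpatialDecay', true);
defaultFields = fieldnames(defaults);
for iField = 1:numel(defaultFields)
    if ~isfield(param, defaultFields{iField})
        param.(defaultFields{iField}) = defaults.(defaultFields{iField});
    end
end
end
```

Because this lives behind the public interface, the validation logic can grow or change without any user-facing code changing.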

4. Write clear code

Clear code reduces the need for extensive documentation and makes your software more accessible to collaborators. Use descriptive and consistent variable names throughout your codebase. When dealing with specific quantities, consider adding units to variable names (e.g., ‘time_ms’ for milliseconds) to improve clarity. You can add comments to explain non-obvious logic and to add general outlines of the steps in your code. Following consistent coding style and formatting guidelines across your project also contributes to overall clarity.

Example: Improving clarity in a data processing function

Instead of an entirely mysterious function:

function [ns, sr] = ns(st, t)
ns = numel(st);
sr = ns/t;
end

Add more descriptive variable and function names and add function headers:

function [nSpikes, spikeRate_s] = numberSpikes(theseSpikeTimes, totalTime_s)
% Count the number of spikes for the current unit
% ------
% Inputs
% ------
% theseSpikeTimes: [nSpikesForThisUnit × 1 double vector] of the time, in seconds, of each of the unit's spikes.
% totalTime_s: [double] total recording time, in seconds.
% ------
% Outputs
% ------
% nSpikes: [double] number of spikes for the current unit.
% spikeRate_s: [double] spiking rate for the current unit, in spikes per second.
% ------
nSpikes = numel(theseSpikeTimes);
spikeRate_s = nSpikes/totalTime_s;
end

5. Design for testing

Incorporate testing into your design process from the beginning. This not only catches bugs early but also encourages modular, well-defined components.

Example: Testable design for a data analysis function

For the simple ‘numberSpikes’ function defined above, we can write a few tests covering various scenarios and edge cases to ensure the function works correctly. For instance, we can test a normal case with a few spikes, and the edge case of an empty spike times input.

function testNormalCase(testCase)
theseSpikeTimes = [0.1, 0.2, 0.3, 0.4, 0.5]; totalTime_s = 1;
[nSpikes, spikeRate] = numberSpikes(theseSpikeTimes, totalTime_s);
verifyEqual(testCase, nSpikes, 5, 'Number of spikes should be 5');
verifyEqual(testCase, spikeRate, 5, 'Spike rate should be 5 Hz');
end

function testEmptySpikeTimes(testCase)
theseSpikeTimes = [];
totalTime_s = 1;
[nSpikes, spikeRate] = numberSpikes(theseSpikeTimes, totalTime_s);
verifyEqual(testCase, nSpikes, 0, 'Number of spikes should be 0 for empty input');
verifyEqual(testCase, spikeRate, 0, 'Spike rate should be 0 for empty input');
end

This design allows for easy unit testing of individual components of the analysis pipeline.

Part 2: Open Source Best Practices for Academia

While using version control and having a README, documentation, license, and contribution guidelines are essential, I have found that these practices have the most impact:

Example Scripts and Toy Data

I have found that the most useful thing you can provide with your software is a set of example scripts, and better still, toy data that your example scripts load directly. Users can then quickly test your software and see how to use it on their own data, and are then more likely to adopt it. If possible, package the example scripts as Jupyter notebooks/MATLAB live scripts (or equivalent) demonstrating key use cases. In bombcell, we provide a small dataset (Bombcell Toy Data on GitHub) and a MATLAB live script that runs bombcell on this toy dataset (Getting Started with Bombcell on GitHub).

Issue-Driven Improvement

To manage user feedback effectively, enforce the use of an issue tracker (like GitHub Issues) for all communications. This approach ensures that other users can benefit from conversations and reduces repetitive work. When addressing questions or bugs, consider if there are ways to improve documentation or add safeguards to prevent similar issues in the future. This iterative process leads to more robust and intuitive software.
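For instance, a bug report caused by a malformed parameter can often be turned into an early, explicit check with an actionable error message (a hypothetical sketch, not bombcell’s actual code):

```matlab
% Hypothetical safeguard: fail fast with a helpful message instead of
% letting a cryptic error surface deep inside the pipeline.
if ~isfield(param, 'ephys_sample_rate') || param.ephys_sample_rate <= 0
    error('bombcell:invalidParam', ...
        ['param.ephys_sample_rate must be a positive number (in Hz). ', ...
        'See the getting-started live script for an example param struct.']);
end
```

Each such safeguard turns one closed issue into a permanent improvement that future users never have to report again.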

Citing

Make your software citable quickly. Before (or instead of) publishing, you can generate a citable DOI using a service like Zenodo. Consider also publishing in the Journal of Open Source Software (JOSS) for light peer review. Clearly outline how users should cite your software in their publications to ensure proper recognition of your work.
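One lightweight way to do this is to add a CITATION.cff file to the repository root, which GitHub renders as a “Cite this repository” button. The values below are placeholders for illustration, not any real project’s metadata:

```yaml
# CITATION.cff (placeholder values for illustration)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Your Toolbox Name"
authors:
  - family-names: "Doe"
    given-names: "Jane"
version: "1.0.0"
doi: "10.5281/zenodo.0000000"
```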

Conclusion

These practices can help create popular, user-friendly, and robust academic software. Remember that good software design is an iterative process, and continuously seeking feedback and improving your codebase (and sometimes entirely rewriting/refactoring parts) will lead to more robust code.

To go deeper into principles of software design, I highly recommend reading “A Philosophy of Software Design” by John Ousterhout or “The Good Research Code Handbook” by Patrick J. Mineault.

Get involved! 

The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Join our mailing list, and follow us on X (formerly Twitter) and LinkedIn to stay connected for updates, events, and opportunities.

 

 

 

Launching our new UCL Citizen Science Community on MS Teams!

By Kirsty, on 24 October 2023

Are you interested in citizen science?

Would you like to connect with others to share your stories about citizen science?

Are you wondering whether a citizen science approach might work for your project?

Would you like to collaborate with colleagues across UCL and exchange ideas or work together on participatory projects under a joint mission?

Do you want to hear about citizen science as an approach or would you just like to expand your network?

If so, please join our informal UCL citizen science community and get involved! Whether you are new to citizen science or whether you have run projects before, this is your community. We are bringing everyone together to share their knowledge, discuss good practices and talk about their experiences of citizen science.  The community is strictly for UCL members only but is open to all staff and students at all levels.

You might call it participatory research, community action, crowdsourcing, public engagement, or something else. UCL supports a broad approach to “citizen science”, recognising that there are different applications and functions of this approach in research, whether they are community-driven research projects or global investigations.

Through this community, we would really like to hear your feedback on what you would like to see from a potential citizen science support service at UCL including any ideas you might have for events, training, resources or anything else. Join here or search “UCL Citizen Science Community” on Teams.

We also have brand new and improved Citizen Science web pages on the UCL Office for Open Science and Scholarship website, which include an introduction to citizen science at UCL (definitions, history, types and levels) and information about the UCL Citizen Science Certificate. You can browse through the various types of citizen science projects at UCL and learn about what your colleagues are doing in this exciting area!

On our new citizen science support and training page, you will find links to relevant training courses currently delivered by different UCL teams both online and in person, a variety of useful resources about citizen science, links to interesting blogs/news and citizen science platforms and projects outside UCL. We will be improving and expanding this content within the coming months.

 If you have any questions, would like further information or would like to tell us what you need, please contact us!

Welcome to the new Training and Support Resources for Research site!

By Harry, on 11 July 2023

Since the founding of UCL’s Office for Open Science and Scholarship, the team has been gathering resources to support researchers, academic staff, students, and everyone interested in developing their skills and understanding of Open Science and the transition towards more democratic models for producing and sharing knowledge in equitable and inclusive ways.

The fast growth of Open practices, and their translation into renewed local, regional, and global policies, has made a systematic realignment of our resources necessary. Over the last months, we have reorganised these assets with our wide range of users and backgrounds in mind, far beyond the academic field, creating our new user-friendly website, ‘Training and Support Resources for Research’, organised into the following up-to-date categories:

  • Advanced Research Computing: practical, hands-on training for various IT skills related to research, including high-performance computing, research software engineering and programming.
  • Citizen Science: our brand-new site with information about what UCL is doing in Citizen Science, innovative projects, and UCL’s Citizen Science Academy. Keep an eye on the Open@UCL Blog, where we will soon expand and go deep into the world of Citizen Science!
  • UCL Copyright advice: information and assistance from UCL Library Services’ Copyright Team on a wide range of copyright issues for UCL students and staff.
  • Creating Accessible Content: a compendium of simple steps to make your content more accessible and provide a more inclusive experience for all.
  • Doctoral Students Resources: a place for rigorous academic and non-academic creative researchers. Explore the resources and expand your skills to support your research, professional development and employability.
  • Information Governance: this site is for all members of UCL who manage highly confidential research information, including principal and chief investigators, staff, students, senior managers, and even those who just supervise people who directly handle confidential information and support staff who do not have direct access to data.
  • Open Access: designed to help UCL researchers understand how to make publications open access, meet open access requirements, use UCL’s Research Publications Service (RPS) and take advantage of open access funding.
  • Research Data Management and Planning: from the initial planning of a project through to archiving and sharing, the research data management team advises the UCL community on managing research outputs – across the research data lifecycle – in line with UCL’s expectations and external funding agencies’ requirements.
  • Research Funding Management: learn more about post-award processes through the online training course of the Fundamentals of Research Funding Management.
  • Research Integrity: summary of training opportunities currently available for staff and students. This list is not exhaustive and is intended to provide guidance as to options available. It will also be updated so do re-visit this page.
  • Research Transparency: research transparency covers how we ensure our research is responsible, reproducible, open and evidence-based.

If you scroll down the website, you will also find UCL’s Organisational Development training, Short Courses and some of UCL’s Communities and Forums that you can join to share your research, get advice and learn something new. We also collated an overview of the Research Support Teams!

Stay tuned to our news, events and training opportunities by subscribing to our mailing list,  following us on Twitter @UCLopenscience, or getting in touch with the Office for Open Science, and one of our teammates will answer as soon as we can!

New Open Science Resources for 22-23

By Kirsty, on 21 September 2022

The team at the UCL Office for Open Science & Scholarship are pleased to be able to share two brand new resources for the 2022-23 academic year!

Open Science & Scholarship video

The first is a new video giving an overview of Open Science at UCL. Dr Paul Ayris, Pro-Vice Provost for Library, Culture, Collections & Open Science (LCCOS), opens the discussion and outlines the context, before local experts delve into each of the core areas that we support. It is accompanied by a short quiz to affirm your new understanding of each area.

This video was created with the support of grants from Research England and Horizon 2020. It was produced by Scientific Knowledge Services (SKS).

After watching the video, you can take the quiz! If you are interested in learning more about anything you heard, visit the OOSS training and support pages for this and further resources.

Guide to Open Science for PhD students

The second resource is entitled Open Science – a practical guide for PhD students. This guide has been designed especially for UCL’s community of PhD students, aiming to introduce the principles of Open Science, but linking them directly to stages of the PhD journey and showing the benefits of embracing Open from the start of their academic journey.

This guide is developed from the original which was published by the French Ministry of Higher Education and Research.

This and other resources to help and support you in your Open Science journey can be found on our website.

 

Altmetric – now available at UCL!

By Kirsty, on 2 September 2022

Guest post by Andrew Gray (Bibliometrics Support Officer)

What is it? 

Altmetrics are “alternative metrics”: measures of the impact of research beyond the scholarly literature. They encompass a wide range of activity in diverse sources: social media (e.g. Twitter, blogs), news publications, and grey literature (e.g. policy documents). This can help give a wider sense of the impact of papers that might otherwise be missed were we to focus only on traditional academic citations.

The primary commercial database for these is Altmetric (https://altmetric.com) – UCL has just taken out a one-year subscription to this service. We hope it will be useful for anyone interested in public engagement or research impact, as well as individual researchers looking at the response to their own work. 

It is open to everyone at UCL by visiting https://www.altmetric.com/explorer/login and entering your UCL email address. It will then authenticate through the UCL single-sign-on system. 

How does it work? 

Altmetric tracks a range of individual sources looking for DOIs, links to papers, or free-text descriptions of articles. It then matches these to the underlying paper and produces an index of the mentions. Here we can see the range of responses to a climate-change study. 

You will also sometimes see this coloured “doughnut” on publisher or repository sites – clicking through will get you to this same page. 

The most interesting part of the service, however, is the dashboard. This aggregates the results from all individual papers, and we can then filter down by subject area, date, publication venue, etc., to produce a more specific analysis. It is also possible to search for keywords to see the change in activity around a specific topic – one like “artificial intelligence” tends to show a steady level of interest, while one like “gravitational waves” shows very dramatic spikes connected with major discoveries. 

What can we do with it? 

The dashboard has been integrated with UCL’s RPS service, so it has a dataset of UCL papers since 2013, each linked to the faculty/department of the authors. This means we can do the same types of analysis for just UCL papers – or just those from a specific department or a specific author. 

The search can also be tweaked to identify specific topics. Here we can see policy documents published in 2022 which cite a Bartlett paper. 

Policy documents are one of the key strengths of Altmetric – they can be used as evidence of wider impact, especially for the social sciences. While they are formal documents, and very distinct from more ephemeral news or social media mentions, they are not indexed in most citation databases and so this impact can often be hard to trace. 

Altmetric data can also be exported: any set of results can be exported so that we can do detailed offline analysis of sets of papers, or of the individual mentions that make up the score. This data includes identifiers such as DOIs and ISBNs, meaning it can be linked up to other datasets easily.

What next? 

We are very keen to get this tool in the hands of as many people at UCL as possible and find how it can be used most effectively. Please have a go and let us know what you think! 

UCL-specific training and guidance is currently under development, and will be published in September 2022. Until then, please feel free to get in touch with the team (bibliometrics@ucl.ac.uk) with queries or requests for assistance. We are happy to arrange training as well. 

The tool is currently provided with a static dataset drawn from RPS, covering papers published 1 January 2013 up to 12 August 2022. We are working with the providers to improve the integration so that it will include “live” data, refreshed from RPS every night; until then, we plan to make periodic updates so that publications are added on a rolling basis. 

 

 

 

Love Data Week – Sharing data? Your questions answered

By Kirsty, on 10 February 2021

Guest post by James Houghton, Research Data Support Officer


Dealing with research data, and the associated legal and administrative issues, can be confusing. This article responds to some of the frequent questions and confusions people have regarding research data management.

Do I always have to share data?

Not always, but in general data sharing is required unless you have a very good reason not to, and UCL expects research to be shared as widely as possible. Data sharing may be inappropriate in the following situations:

  • The project contains personal data which could compromise the privacy of individuals. In this case the Data Protection Act (2018) applies and the data cannot be shared.
  • There is a possibility that the research could be commercialised. In this case, data should not be shared before obtaining necessary patent protections.
  • There are other ethical concerns for which sharing can justifiably be refused. For example, data on an endangered species might be used by poachers, so it would be reasonable not to share this data.

If you are ever unsure about releasing data, speak to someone before you proceed. The Library RDM team and the Data Protection Team can advise on this.

Does UCL have a data sharing policy?

Yes, and it specifies the expectations placed on all UCL staff and students on making data available.

Be aware that in addition to the UCL policy, funding agencies will have their own requirements. You need to be compliant with all policies that might apply!

So, I need to share my data. Does UCL have a platform for data sharing?

Yes, we do! UCL has its own data repository service, the UCL Research Data Repository.

I don’t have any data.

The term “data” is used as a shorthand to cover all research outputs, so even if you think you don’t have data, you probably generate something during the course of your research that should be preserved and potentially shared. Even if your field uses a different term, you are probably still bound by the data sharing policy.

Here is a wide-ranging list of what could be considered “research data”:

  • Research notebooks, detailing progress of research and experiments
  • Responses to surveys and questionnaires
  • Software, code, algorithms, and models
  • Measurements from laboratory or field equipment
  • Images (such as photographs, films, scans of documents)
  • Methods, protocols, and experimental procedures
  • Databases of collected information
  • A corpus of writings
  • Audio and video recordings
  • Interview transcripts
  • Physical samples and objects

If you have an output not included in this list, it can still be classed as research data!

What on earth is metadata?

Metadata is simply data that describes other data. Here are a few examples:

  • A description of the inclusion criteria for enrolling participants in a study
  • The set of questions used in interviews
  • Any file naming conventions used to keep track of data
  • The parameters used by any equipment used to make measurements
  • The dates and times images were taken
  • Details of quality assurance steps to explain why some data points were deemed to be erroneous and unsuitable for analysis
  • Administrative information such as dates of interviews, experiments or visits to a location

This is not an exhaustive list by any means! Metadata can vary considerably between projects and research fields.

In the same way data might underpin the results of a project, metadata could be said to underpin the methods of a project. If you need to address the issue of metadata, think about what another researcher would need to know to replicate the data as closely as possible.

What resources can I access at UCL to store data safely?

All UCL IT managed storage services have automated backups in place to protect data and are recommended over using your own personal devices or individual cloud storage accounts. There are a few different options depending on your needs:

  • The personal N: drive or S: drives are fine for day-to-day storage of PDFs, office documents and non-sensitive materials.
  • The Research Data Storage Service supports high-speed file transfer for large quantities of data and is extremely useful for anyone who wants to work with the high-performance computing clusters.
  • The Data Safe Haven is specifically designed to store personal data covered by the Data Protection Act 2018. This secure service helps you meet legal obligations on data security when relevant.
  • Services such as SharePoint and OneDrive can be useful for collaboration with colleagues and allow for functionality such as simultaneous editing of documents.

Need more information?

We have extended guidance on research data management available on our website and the library research data management team can be contacted to discuss specific issues at: lib-researchsupport@ucl.ac.uk

UCL researchers respond to COVID-19 pandemic

By Patrycja, on 21 April 2020

UCL researchers are accustomed to working across disciplines, with colleagues from many different institutions, to help address the biggest challenges facing the world today. It’s no different with the COVID-19 crisis – though now their work is in the public eye as perhaps never before.

UCL clinical academics have joined frontline medical staff in fighting the outbreak and UCL is providing resources for NHS medical staff. Our researchers are developing rapid tests and tracking systems for COVID-19 and are taking a prominent role in advancing public knowledge about the virus.

Many UCL academics are already releasing papers analysing the outbreak, case studies, predictions about the course of the pandemic and assessments of its economic, health and social implications. In a global crisis, public access to high-quality scientific information is critical. Some publishers have introduced special arrangements to make COVID-19 publications openly available during the pandemic. UCL authors also make their papers openly available in UCL Discovery, UCL’s open access repository, where they are curated and kept open access in perpetuity.

In the first of what we hope will be a series of regular posts, we are featuring the latest outputs by UCL academics available in the repository.

A commentary by Diana Margot Rosenthal, Marcella Ucci, Michelle Heys, Andrew Hayward and Monica Lakhanpaul analyses the impact of COVID-19 on families experiencing homelessness: discovery.ucl.ac.uk/id/eprint/1009

Ali Zumla, from UCL Department of Infection, co-authored a paper that analyses imaging findings of the first two patients identified in Italy with COVID-19 infection: https://discovery.ucl.ac.uk/id/eprint/10094977

Andrew Hayward from the Research Department of Epidemiology & Public Health, Sarah Beale from the Institute of Health Informatics and Anne M. Johnson from the Institute of Global Health analyse the implications of social distancing to control the pandemic: discovery.ucl.ac.uk/id/eprint/1009 This article is also available on Wellcome Open Research, a megajournal platform with open peer review.

Another article by Andrew Hayward, Sarah Beale and Anne M. Johnson, on seasonality and immunity to laboratory-confirmed seasonal coronavirus, is also available for open peer review on the Wellcome Open Research platform. The dataset supporting this article is available in UCL Discovery: https://discovery.ucl.ac.uk/id/eprint/10093909/

Jayant Vaidya, Professor of Surgery & Oncology, has co-authored an article describing methods of reducing infection and rationalising workloads. It’s available in UCL Discovery here: discovery.ucl.ac.uk/id/eprint/1009

For more on COVID-19 research at UCL, please see our webpages here: https://www.ucl.ac.uk/covid-19-research/

Open Access Button

By Kirsty, on 26 March 2020

One of the most frustrating aspects of doing research is when you come up against an article you can’t get at without paying. Even with the wide range of databases and journals that the library subscribes to, coverage is not complete. JISC, SPARC, Open Society Foundations, the Centre for Open Science and many others have worked together to create a solution to this problem called the Open Access Button.

What does it do?

The Open Access Button tool tells you if there are free (and legally available) copies of articles as you go along – without you having to search them out. Once installed, the tool searches an extensive collection of existing repositories and aggregators in the background of your browser and indicates when it finds an Open Access version of something using a discreet icon on your screen.

They also provide another option: a button you can install in your browser to run a quick search for Open Access versions of a work on demand.

Why do I need it?

The Open Access Button team support the Open Access movement and believe that outputs of publicly funded and supported research should be openly and freely available for use by the public and by other researchers.

There has been a lot of work in the last few years to increase the amount of Open Access content available online. The number of works available Open Access is growing every day, but many are still only available to those who can pay subscriptions. This means not only that just the richest institutions can access the results of research, but also that public resources which could fund new research are spent simply to read work that has already been done.

The role of the Open Access Button is to make it easier to access works that are already freely available by providing a single point of search across the numerous repositories out there, with the assurance that what you find is legal and from a reliable source. At the same time, it identifies restricted works and works with researchers to release their full potential for the public good, by allowing you to request copies of works that are not yet Open Access.

How does it work?

When you find an article, the Open Access Button tool uses the information on the page (the bibliographic metadata) to search its approved sources for an Open Access copy of the work. Sources include most of the major global aggregated repositories, such as:

  • OA DOI, which provides the data behind Unpaywall, an app that, like the OA Button, leads straight to legitimate author-uploaded versions of publishers' articles.
  • SHARE, a US service developed by the Association of Research Libraries in partnership with the Center for Open Science
  • CORE, which offers “seamless access to millions of open access research papers, enrich the collected data for text-mining and provide unique services to the research community.”
  • OpenAIRE, a European resource that offers an OA search engine and a campaign platform driving Open Access development and policy.
  • Dissemin, a French resource with a slightly different approach: “Dissemin searches for copies of your papers in a large collection of open repositories and tells you which ones cannot be accessed”
  • Europe PMC, which specialises in life sciences research
  • BASE, a Germany-based aggregator.

In addition, if you ask the Open Access Button to search for an article that is not available openly, a request is sent to the author asking them to share it. The service supports authors in sharing the article quickly and legally.
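To make the lookup described above concrete, here is a minimal sketch of how a tool in this family can resolve an article's metadata to a free copy. It assumes an Unpaywall-style record (one of the sources the Button draws on); the record below is a hand-made illustration of that shape, not real data, and `best_oa_url` is a hypothetical helper name.

```python
# Sketch: pick the best Open Access location from an Unpaywall-style record.
# The sample record is illustrative only; field names follow Unpaywall's
# documented response shape ("best_oa_location", "host_type", "url").

def best_oa_url(record):
    """Return the URL of the best Open Access copy, or None if there is none."""
    location = record.get("best_oa_location")
    return location.get("url") if location else None

sample = {
    "doi": "10.1234/example",          # hypothetical DOI for illustration
    "is_oa": True,
    "best_oa_location": {
        "url": "https://discovery.ucl.ac.uk/id/eprint/10093909/",
        "host_type": "repository",     # "repository" = green OA; "publisher" = gold
    },
}

print(best_oa_url(sample))
print(best_oa_url({"doi": "10.1234/paywalled", "is_oa": False,
                   "best_oa_location": None}))
```

A browser extension would fetch such a record using the DOI scraped from the page's bibliographic metadata, then show its icon only when a URL comes back.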

Is it legal?

The Open Access Button will only show you legal, freely available copies. Your assurance of this comes from the sources it uses and the supporters of the initiative.

How can I get it?

Check your Browser

Open Access Button works with Chrome, Firefox and Safari. It is less successful with Internet Explorer or Microsoft Edge.

Ways to use the Open Access Button

  1. Use the Search Engine: On the homepage, enter any part of a bibliographic citation and the search engine will seek out an Open Access copy – if one exists.
  2. Use the Button: Add the Open Access Button browser extension. Whenever you land on a journal abstract page or find a reference in Google Scholar, the icon on the right-hand side of your screen will tell you whether the work is available and why. If the work was self-archived in an institutional, funder or subject repository, the icon will be green; if it is Open Access on the publisher's pages, the icon will be gold. If it doesn't automatically identify the status, you can click the button to run a search manually.