
Open@UCL Blog


Archive for the 'Guest post' Category

Announcing: UCL’s first Replication Games

By Kirsty, on 17 February 2025

Registrations are now open for UCL’s first Replication Games, organised by the Office for Open Science & Scholarship and UCL’s UKRN local network chapter. The event will be run by the Institute for Replication (I4R), and it is supported by a Research Culture Seed Grant.

The Replication Games is a one-day event that brings together researchers to collaborate on reproducing and replicating papers published in highly regarded journals. Participants will join a small team of 3-5 members with similar research interests. Teams verify the reproducibility of a paper using its replication package. They may conduct sensitivity analyses, employing different procedures from those used by the original investigators. Teams may also recode the study using the raw or intermediate data, or implement novel analyses with new data. More information can be found on the I4R website.

Teams will be guided in all activities by Derek Mikola, an experienced facilitator from the I4R. After the event, teams are encouraged to document their work in a report that will be published on the I4R website. Participants are also eligible for co-authorship of a meta-paper that combines a large number of replications.

This event takes place in person. Lunch and afternoon snacks are provided.

Who are we inviting to register?

Registration is on a ‘first come, first served’ basis. We invite MRes students, doctoral students and researchers, post-docs, and faculty members at UCL to apply. Although students and scholars from all disciplines can apply, we especially hope to attract those working in the social sciences and humanities.

Participants must be confident using at least one of the following: R, Python, Stata, or Matlab.

Papers available for replication are listed on the I4R website. Prospective participants are asked to review this list to ensure that at least one paper aligns with their research interests.

How to apply?

Please complete this short form: https://forms.office.com/e/WEUUKH2BvA

Timeline and Procedure

  • 15 March 25 – registrations close
  • 31 March 25 – notification of outcomes and teams
  • 7 April 25,  1pm – Mandatory Teams call with the I4R (online)
  • 25 April 25, 9am-5pm – Replication Games (at UCL’s Bloomsbury Campus)

Please note that participants are expected to attend the full day.

Contact

If you have any questions, please contact Sandy Schumann (s.schumann@ucl.ac.uk)

Whose data is it anyway? The importance of Information Governance in Research

By Kirsty, on 11 February 2025

Guest post by Preeti Matharu, Jack Hindley, Victor Olago, Angharad Green (ARC Research Data Stewards), in celebration of International Love Data Week 2025

Research data is a valuable yet vulnerable asset. Researchers collect and analyse large amounts of personal and sensitive data, ranging from health records to survey responses, and this raises an important question – whose data is it anyway?

If data involve human subjects, then participants are the original owners of their personal data. They grant permission to researchers to collect and use their data through informed consent. Responsibility for managing and protecting that data, in line with legal, regulatory and ethical requirements and with institutional policies, therefore lies with researchers and their institution, maintaining a balance between participant rights and researcher needs.

Under the General Data Protection Regulation (GDPR) in the UK and EU, participants have the right to access, update and request deletion of their data, whilst researchers must comply with the law to ensure research integrity. However, under the Data Protection Act, research data processed in the public interest must be retained irrespective of participant rights, including the rights to erasure, access and rectification. UCL must uphold this requirement while ensuring participant confidentiality is not compromised.

Information governance consists of policies, procedures and processes adopted by UCL to ensure research data is managed securely and complies with legal and operational requirements.

Support for information governance in research is now provided by Data Stewards within ARC RDM IG. That’s a long acronym, so let’s break it down.

  • ARC: Advanced Research Computing – UCL’s research innovation centre, which provides (1) secure digital infrastructure and (2) teaching and software support.
  • RDM: Research Data Management – assists researchers with data management.
  • IG: Information Governance – advises researchers on compliance when managing sensitive data.

As Data Stewards, we support researchers with data management throughout the research study and provide guidance on data security awareness training, data security requirements for projects, and compliance with legal and regulatory standards, encompassing the Five Safes Framework principles. Additionally, we advise on sensitive data storage options, such as a Trusted Research Environment (TRE) or the Data Safe Haven (DSH).

Furthermore, we emphasise the importance of maintaining up-to-date and relevant documentation and provide guidance on FAIR (Findable, Accessible, Interoperable, Reusable) data principles.

As stated above, data can be vulnerable. UCL must implement strong security controls including encryption, access control and authentication, to protect sensitive data, such as personal health data and intellectual property. Sensitive data refers to data whose unauthorised disclosure could cause potential harm to participants or UCL.

UCL’s Information Security Management System (ISMS) is a systematic approach to managing sensitive research data to ensure confidentiality, integrity, and availability. It is a risk management process involving people, processes and IT systems. The key components include information management policy, identifying and assessing risks, implementing security controls to mitigate identified risks, training users and continuous monitoring. The ISMS is crucial in research:

  1. It protects sensitive data; without stringent security measures, data is at risk of being accessed by unauthorised individuals leading to potential theft.
  2. It ensures legal and regulatory compliance, e.g. with the GDPR and UCL policies. Non-compliance can result in hefty fines, legal action and reputational damage.
  3. Research ethics demand that participant data is handled with confidentiality. The ISMS ensures sound data management practices, data anonymisation and controlled access, reinforcing ethical responsibility.
  4. It reduces the risk of phishing attacks and ransomware.
  5. It ensures data integrity and reliability – tampered or corrupted data can lead to invalid research and waste of resources.

UCL practices for Information Governance in research:

So, in response to the question ‘whose data is it anyway?’: data may be generated by participants, but the overall responsibility to use, process, protect and ethically manage it lies with researchers and UCL. Beyond compliance, good information governance is about ensuring research integrity and safeguarding the participants who make research possible.

Reflections on the UCL Citizen Science Community Event

By Kirsty, on 11 December 2024

Post provided by Sheetal Saujani, Citizen Science Support Officer.
[Image: Professor Muki Haklay presenting at a lectern to an attentive, note-taking audience.]

This week’s UCL Citizen Science Community Event brought researchers, academics, students and staff from across UCL together to celebrate citizen science and participatory research. The slides and programme are available online so that everyone can follow up with the references.

The morning was packed with engaging talks and workshops, highlighting unique perspectives and approaches to citizen science. From discussions about an innovative mapping tool and the role of the UCL Urban Room in preserving local memories to hands-on workshops that sparked new ideas, the event demonstrated the diversity and potential of citizen-led research. One highlight was seeing the genuine passion from the presenters, whose enthusiasm was contagious.

At the start of the event, the Citizen Science team at the Office for Open Science and Scholarship and the UCL Co-Production and Public Engagement teams showcased their services and support, including training, resources and advice. Their dedication to supporting citizen science and public engagement left everyone excited and ready to get involved. Both teams are eager to hear from the UCL community about their projects and the support they may need, so they encourage everyone to get in touch.

We also heard from Marcos Moreu and Fabien Moustard from the Extreme Citizen Science Research Group (ExCiteS) about Kapta, a new tool they helped design that allows mapping data to be collected simply using WhatsApp, so that small or remote communities can collect data without extensive knowledge or additional training. More information can be found in the slides above or on Kapta.earth.

Professor Muki Haklay delved into the topic of skills and competencies in citizen science, exploring how competency frameworks define the skills, knowledge, and values needed for success in this field. His talk highlighted frameworks like the UCL Academic Careers Framework, Vitae’s Researcher Development Framework, and ECS Academy’s Research Competencies framework, which outline progression paths, accreditation standards, and expectations at foundational to expert levels. These frameworks not only codify what individuals need to know but also guide learning, recruitment, and career development. A live Mentimeter poll engaged attendees, inviting their perspectives on key skills, competencies and areas for development, sparking conversations about advancing expertise in citizen science.

[Image: participants gathered around groupings of coloured post-it notes on a wall.]

Dr Izzy Bishop led an interactive session where participants contributed their thoughts on the various elements required to include citizen science in a project proposal. The flipchart became a dynamic map, highlighting key areas such as co-design, engagement, training, communications, technology, volunteer management, data analysis, and impact. Each participant helped to identify UCL’s expertise and interest in these areas, promoting collaboration and a shared understanding.

We also discussed strategies for applying for and sustaining project funding, with a focus on Horizon Europe opportunities. Professor Muki Haklay provided valuable insights on successfully navigating the application process and securing support for citizen science initiatives.

Hearing Jo Baines discuss UCL’s Urban Room, and the innovative Memory Workshop was truly inspiring. These projects showcase the power of engaging communities, including underrepresented groups, to explore shared histories and reimagine urban spaces. The Memory Workshop and Memory Bike demonstrate citizen science in action, amplifying diverse voices, and uncovering new insights through creative, participatory research. By equipping participants with skills and tools that extend beyond the projects, they highlight how citizen science connects people and ideas to drive meaningful change.

We were captivated by voices from two remarkable projects, Flotilla by Melanie Manchot and Breaking Waves, which highlight the power of art and storytelling to link communities and preserve collective memory. Both reflect the spirit of citizen science by fostering connections and inspiring new perspectives. The links to experience these projects for yourself are available in the slides linked above.

Before the event closed, there was a fun and relaxed speed networking session. The vibrant spirit of the session encouraged meaningful conversations and connections, reminding us of the value of sharing knowledge and experiences across disciplines and communities!

A huge thank you to all the speakers, workshop leaders, and participants who made the event so worthwhile. We hope everyone left as inspired as we did to continue championing citizen science!

Copyright and Open science in the age of AI: what can we all do to ensure free and open access to knowledge for all?

By Rafael, on 24 October 2024

We are approaching the end of International Open Access Week, and we have been enjoying a series of interesting insights and discussions across UCL! Earlier this week, we explored the balance between collaboration and commercialisation, and highlighted the important work of Citizen Science initiatives and the growing significance of open access textbooks.

Today, Christine Daoutis, UCL Copyright Support Officer, will build on our ongoing series about copyright and open science, focusing on how we can ensure free and open access to knowledge in the age of AI, by addressing copyright challenges, advocating for rights retention policies, and discussing secondary publication rights that benefit both researchers and the public.


Open Access Week 2024 builds on last year’s theme, Community over Commercialisation, aiming not only to continue discussions but to take meaningful action that prioritises the interests of the scholarly community and the public. This post focuses on copyright-related issues that, when addressed by both individual researchers and through institutional, funder, and legal reforms, can help create more sustainable and equitable access to knowledge.

[Infographic: people climbing ladders towards a large key, symbolising control over open access to knowledge. The text reads: "By exercising your rights, you can share your knowledge as you wish and enable everyone to benefit from your research." #RetainYourRights]

Rights retention infographic. Source: cOAlition S

Retaining author rights

Broadly speaking, rights retention means that authors of scholarly publications avoid the traditional practice of signing away their rights to publishers, typically done through a copyright transfer agreement or exclusive licence. Instead, as an author, you retain at least some rights that allow you to share and reuse your own research as openly as possible. For example, you could post your work in an open access repository, share it on academic networks, reuse it in your teaching, and incorporate it into other works like your thesis.

Many funders and institutions have specific rights retention policies that address related legal issues. If such a policy applies, and publishers are informed in advance, authors typically need to retain rights and apply an open licence (usually CC BY) to the accepted manuscript at the point of submission.

Rights retention ensures that your research can be made open access without relying on unsustainable pay-to-publish models, and without facing delays or restrictions from publishers’ web posting policies. Importantly, rights retention is not limited to published research—it can be applied to preprints, data, protocols, and other outputs throughout the research process.

Secondary Publication Rights (SPRs)

Secondary publication rights (SPRs) refer to legislation that allows publicly funded research to be published in an open access repository or elsewhere, at the same time as its primary publication in academic journals. Some European countries already have SPRs, as highlighted by the Knowledge Rights 21 study conducted by LIBER, and LIBER advocates for #ZeroEmbargo on publicly funded scientific publications. There are ongoing calls to harmonise and optimise these rights across countries, ensuring that the version of record becomes immediately available upon publication, overriding contractual restrictions imposed by publishers.

SPRs can apply to different types of research output and are meant to complement rights retention policies. However, introducing SPRs depends on copyright reform, which is not an action individual researchers can take themselves, though it’s still useful to be aware of developments in this area.

[Image: digital collage of a human head formed from circuit patterns, surrounded by machine-learning icons, with a hand reaching towards them.]

Source: Computer17293866, CC BY-SA 4.0, via Wikimedia Commons

Artificial Intelligence and your rights

The rise of Generative AI (GenAI) has introduced broader issues affecting researchers, both as users and as authors of copyrighted works. These include:

  • Clauses in subscription agreements that seek to prevent researchers from using resources their institution has subscribed to for AI-related purposes.
  • Publishers forming agreements with AI companies to share content from journal articles and books for AI training purposes, often without clear communication to authors. A recent deal between Taylor & Francis and Microsoft for $10 million has raised concerns among scholars about how their research will be used by AI tools. In some cases, authors are given the option to opt in, as seen with Cambridge Press.
  • For works already licensed for reuse, such as articles under a CC BY licence or those used under copyright exceptions, questions arise about how the work will be reused, for what purposes, and how it will be attributed.

While including published research in AI training should help improve the accuracy of models and reduce bias, researchers should have enough information to understand and decide how their work is reused. Creative Commons is exploring ‘preference signals’ for authors of CC-licensed works to address this issue.

The key issue is that transferring your copyright or exclusive rights to a publisher restricts what you can do with your own work and allows the publisher to reuse your work in ways beyond your control, including training AI models.

Using Copyright exceptions in research

UK copyright law includes exceptions (known as ‘permitted acts’) for non-commercial research, private study, criticism, review, quotation, and illustration for instruction. As a researcher, you can rely on these exceptions as long as your use qualifies as ‘fair dealing’, as previously discussed in a blog post during Fair Dealing Week. Text and data mining for non-commercial research is also covered by an exception, allowing researchers to download and analyse large amounts of data to which they have lawful access.

Relying on copyright exceptions involves evaluating your purpose and, for some exceptions, making a decision around what is ‘fair’. This also involves some assessment of risk. Understanding copyright exceptions helps you exercise your rights as users of knowledge and make confident assessments as to whether and when a copyright exception is likely to apply, and when permission is necessary. [see links for UK legislation at the end of this article]

[Image: hands holding up large, colourful letters spelling "COPYRIGHT" against a light blue background.]

Source: www.freepik.com

Engage with copyright at UCL

The conversations sparked during Open Access Week continue throughout the year at UCL as part of ongoing copyright support and education. To engage further with these issues, you can:

Useful Legislation

OA Textbooks: UCL Press Perspective

By Rafael, on 23 October 2024

As we continue celebrating International Open Access Week, it’s inspiring to see how open access is shaping research and collaboration across UCL! Earlier this week, we explored the balance between collaboration and commercialisation, and highlighted the important work of Citizen Science initiatives.

Midway through the week, Dhara Snowden, Textbook Programme Manager at UCL Press, shares her insights below on the growing significance of open access textbooks and their role in the broader landscape of open access publishing.

[Logo: International Open Access Week, featuring a stylised orange open lock symbolising open access.]


UCL Press is the UK’s first, and largest, fully open access university press. We have been publishing books and journals for almost ten years, covering a wide range of subjects and formats. In 2021, the press launched a new programme focusing on open access (OA) textbooks. I joined at that time, and over the past three years, I’ve been responsible for managing our progress in this area. As part of this year’s International Open Access Week, I wanted to share some of our thinking and planning around this activity and offer some musings about OA textbook publishing in general and what the future holds.

What are Open Access Textbooks?

Firstly, it’s useful to clarify what we mean by “textbooks.” Through conversations with academics across faculties, I’ve realised this term can mean different things depending on the context and level of study. In the broadest sense, a textbook is a resource that supports the delivery of a course or module. Textbooks can provide a wide-view survey of a field or subject, to be “adopted” as the main guide for study, or they can be part of a reading list that includes chapters from various sources, like journal articles.

Although textbooks were traditionally published in print, there has been increasing demand for digital versions of commercial textbooks, which can be purchased by individuals and licensed to institutions for use by multiple users.

Open access publishing, meanwhile, involves making scholarly content freely available online, removing subscription or purchase barriers. In the context of textbooks, this means students and educators can access high-quality educational resources without significant costs. OA textbooks are typically published under a Creative Commons (CC) licence, which allows for redistribution, adaptation, and modification, promoting a collaborative and inclusive educational environment.

The creation and uptake of OA textbooks has seen a sharp increase in recent years, particularly in the US and UK, with non-profit-funded publishers like OpenStax, collaboratively funded projects like CORE Econ, and platforms like Pressbooks. The Open Textbook Library, supported by Open Education Network, currently indexes 1,542 published open textbooks.

Why do we create them?

The UCL Press textbook programme was set up in direct response to issues around pricing for institutional access to essential e-textbooks, which were exacerbated during the Covid-19 pandemic. The current ecosystem presents an unstable and unsustainable financial model for institutional libraries, as is well documented by the eBooks SOS campaign, which calls attention to the lack of regulation in pricing by commercial publishers.

An article published by Insights in 2022, ‘Perspectives on e-books and digital textbooks and the way ahead’, claims that ‘combined spending on books across nationally negotiated library purchasing frameworks increased from £55M in 2019/20 to £73M in 2020/21, with e-textbook provision increasing by 281% to £25.1M during that time’.

In addition to concerns about affordability and sustained access, the Insights article outlines that post-pandemic, “shifts in teaching practice are accelerating demand for features that enhance blended learning”, with more flexibility and adaptability in resources being required, which isn’t being delivered by traditional academic publishing.

UCL Press aims to disrupt the current academic publishing ecosystem by offering authors and readers an alternative to the commercial model. This connects to the theme of OA Week 2024, which calls for community over commercialisation. By bringing publishing back into the hands of the academy, we can provide sustainable, high-quality textbooks that facilitate hybrid teaching and remove barriers to accessing our content, thereby reaching the widest possible audience and increasing the chance of impact in scholarly communities.

How do we create them?

The UCL Press textbook programme commissions and publishes textbooks for undergraduate and postgraduate students across a wide range of subjects and topics. Every new proposal (and final manuscript) undergoes a rigorous peer-review process to ensure high-quality and relevant content.

Our approach is to collaborate with lecturers to create resources that provide high-quality guidance for students. Taking a personal and flexible approach to each project, we avoid rigid templates or a one-size-fits-all mentality, tailoring our textbooks to the needs of students and subject matter.

[Image: covers of An Introduction to Waste Management and Circular Economy by Stijn van Ewijk and Julia Stegemann, and Methods and Methodologies in Heritage Studies, edited by Rachel King and Trinidad Rico, both published by UCL Press.]

To date, we’ve published two textbooks. The first, An Introduction to Waste Management and Circular Economy, came out in December 2023. This textbook supports a module taught at UCL but also has global relevance for courses in environmental engineering, resource efficiency, bioenergy, and waste-to-energy technologies.

More recently, we published  Methods and Methodologies in Heritage Studies, an edited collection exploring the disciplinary debates, intellectual legacies, and practical innovations that shape contemporary understandings of heritage value.

 

[Figure: UCL Press monthly textbook access statistics from December 2023 to September 2024, showing a steady increase in engagement, alongside a world map of access by country, with the US showing the highest access.]

 

Together, these two titles have been downloaded over 12,000 times in 152 countries and territories. Our forthcoming titles include A Guide to Performing Systematic Reviews of Health and Disease and Fundamentals of Dark Matter, both due to be published early next year.

What are the benefits of writing or using OA textbooks? Where’s the value?

There are many benefits to writing and using OA textbooks, and the European Network of Open Education Librarians has created a toolkit to encourage the use of OA materials and Open Educational Resources (OER). Some key points are listed below.

  • Reaching a Global Audience: downloads and views from readers across the globe, particularly the Global South.
  • Cost-Effectiveness: One of the most significant advantages of OA textbooks is their cost-saving potential for both libraries and students. OA resources can alleviate this burden, allow saved funds to be redistributed and make education more accessible for all.
  • Adaptability: open licences enable reuse, modification and adaptation, enabling educators to make the content work best for their teaching.
  • Showcase Teaching Excellence: OA textbooks can help platform new approaches or areas of study, and celebrate examples of teaching excellence.
  • Encourage lifelong learning: Provide students with resources they can use and reference after their studies and into their careers.
  • Accessibility and Inclusivity: Open access textbooks are available to anyone with an internet connection, supporting diverse learners, including those in remote or under-resourced areas, and those outside academic institutions (e.g. professionals and policymakers).
  • Up-to-Date Content: Traditional textbooks can quickly become outdated. OA textbooks can be updated and revised more readily, ensuring that students have access to the most current information.
  • Enhanced Collaboration: Open resources encourage collaboration among educators, fostering a community of shared knowledge and innovation.

To measure impact, we use both qualitative and quantitative measures. Our Statistics dashboard shows the readership and reach of our books, including a map of access. In addition, we are collecting testimonials and feedback from academics and students and engaging with the societal impact of our books (as discussed in a recent article in The Scholarly Kitchen). We interrogate our user analytics to understand which communities are interacting with our content and how they are using it in their own work.

Value in this context is calculated not only in terms of cost savings on commercial provision, but also in the added value to the student experience. This includes showcasing teaching excellence to encourage enrolment in a particular course, or providing teaching and learning resources for modules that are underrepresented in commercial provision (i.e. those with smaller cohorts, interdisciplinary topics or less mainstream perspectives).

What does the future hold?

The future of open access textbook publishing in the UK looks promising, with increasing institutional support and growing awareness among educators. As the demand for affordable and accessible educational resources continues to rise, the potential for OA textbooks to reshape higher education is significant.

Open access textbook publishing represents a vital shift in the academic landscape, providing a sustainable, equitable, and collaborative approach to education. As more institutions and publishers embrace this model, we hope to create a future where quality educational resources are accessible to all, empowering students and educators alike.

A significant sea change in the status quo requires a long-term outlook, as well as significant investment and commitment. If educators, students, and policymakers continue to discover and advocate for the adoption of open access resources within institutions, there is potential to foster an educational environment that fully supports and values accessibility, collaboration, and innovation.

UCL Press will continue to develop its programme of OA textbooks. To keep up to date with our publications, please sign up to our mailing list or take a look at our website.

Open Source Software Design for Academia

By Kirsty, on 27 August 2024

Guest post by Julie Fabre, PhD candidate in Systems Neuroscience at UCL. 

As a neuroscientist who has designed several open source software projects, I’ve experienced firsthand both the power and pitfalls of the process. Many researchers, myself included, have learned to code on the job, and there’s often a significant gap between writing functional code and designing robust software systems. This gap becomes especially apparent when developing tools for the scientific community, where reliability, usability, and maintainability are crucial.

My journey in open source software development has led to the creation of several tools that have gained traction in the neuroscience community. One such project is bombcell, a software package designed to assess the quality of recorded neural units. This tool replaces what was once a laborious manual process and is now used in over 30 labs worldwide. Additionally, I’ve developed several other, smaller toolboxes for neuroscience.

These efforts were recognized last year when I received an honourable mention in the UCL Open Science and Scholarship Awards.

In this post, I’ll share insights gained from these experiences. I’ll cover, with some simplified examples from my toolboxes:

  1. Core design principles
  2. Open source best practices for academia

Disclaimer: I am not claiming to be an expert. Don’t view this as a definitive guide, but rather as a conversation starter.


Follow Julie’s lead: Whether you’re directly involved in open source software development or any other aspect of open science and scholarship, or if you simply know someone who has made important contributions, consider applying yourself or nominating a colleague for this year’s UCL Open Science and Scholarship Awards to gain recognition for outstanding work!


Part 1: Core Design Principles

As researchers, we often focus on getting our code to work, but good software design goes beyond just functionality. In order to maintain and build upon your software, following a few principles from the get-go will elevate it from “it works” to “it’s a joy to use, maintain and contribute to”.

1. Complexity is the enemy

A primary goal of good software design is to reduce complexity. One effective way to simplify complex functions with many parameters is to use configuration objects. This approach not only reduces parameter clutter but also makes functions more flexible and maintainable. Additionally, breaking down large functions into smaller, more manageable pieces can significantly reduce overall complexity.

Example: Simplifying a data analysis function

For instance, in bombcell we run many different quality metrics, and each quality metric is associated with several other parameters. In the main function, instead of inputting all the different parameters independently:

[qMetric, unitType] = runAllQualityMetrics(plotDetails, plotGlobal, verbose, reExtractRaw, saveAsTSV, removeDuplicateSpikes, duplicateSpikeWindow_s, detrendWaveform, nRawSpikesToExtract, spikeWidth, computeSpatialDecay, probeType, waveformBaselineNoiseWindow, tauR_values, tauC, computeTimeChunks, deltaTimeChunks, presenceRatioBinSize, driftBinSize, ephys_sample_rate, nChannelsIsoDist, normalizeSpDecay, (... many many more parameters ...), rawData, savePath);

they are all stored in a ‘param’ object that is passed onto the function:

[qMetric, unitType] = runAllQualityMetrics(param, rawData, savePath);

This approach reduces parameter clutter and makes the function more flexible and maintainable.

2. Design for change

Research software often needs to adapt to new hypotheses or methodologies. When writing a function, ask yourself “what additional functionalities might I need in the future?” and design your code accordingly. Implementing modular designs allows for easy modification and extension as research requirements evolve. Consider using dependency injection to make components more flexible and testable. This approach separates the creation of objects from their usage, making it easier to swap out implementations or add new features without affecting existing code.

Example: Modular design for a data processing pipeline

Instead of a monolithic script:

function runAllQualityMetrics(param, rawData, savePath)
% Hundreds of lines of code doing many different things
(...)
end

Create a modular pipeline that separates each quality metric into a different function:

function qMetric = runAllQualityMetrics(param, rawData, savePath)
    nUnits = length(rawData);
    for iUnit = 1:nUnits
        % step 1: calculate percentage spikes missing
        qMetric.percSpikesMissing(iUnit) = bc.qm.percSpikesMissing(param, rawData);
        % step 2: calculate fraction refractory period violations
        qMetric.fractionRPviolations(iUnit) = bc.qm.fractionRPviolations(param, rawData);
        % step 3: calculate presence ratio
        qMetric.presenceRatio(iUnit) = bc.qm.presenceRatio(param, rawData);
        (...)
        % step n: calculate distance metrics
        qMetric.distanceMetric(iUnit) = bc.qm.getDistanceMetric(param, rawData);
    end
    bc.qm.saveQMetrics(qMetric, savePath)
end

This structure allows for easy modification of individual steps or addition of new steps without affecting the entire pipeline.

In addition, this structure allows us to easily define new parameters that can then modify the behavior of the subfunctions. For instance, we can add different methods (such as adding the ‘gaussian’ option below) without changing how any of the functions are called!

param.percSpikesMissingMethod = 'gaussian';
qMetric.percSpikesMissing(iUnit) = bc.qm.percSpikesMissing(param, rawData);

and then, inside the function:

function percSpikesMissing = percSpikesMissing(param, rawData)
    % use strcmp rather than == to compare strings robustly
    if strcmp(param.percSpikesMissingMethod, 'gaussian')
        (...)
    else
        (...)
    end
end
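
The dependency injection mentioned above works in the same spirit. As a minimal sketch (hypothetical code, not taken from bombcell), a metric can be passed in as a function handle, so callers can swap implementations or inject a simple stub when testing:

function metricValue = computeMetric(metricFcn, param, rawData)
    % metricFcn is any function handle with the signature f(param, rawData)
    metricValue = metricFcn(param, rawData);
end

% usage: inject a real metric function (myPresenceRatio is a placeholder name),
% or a trivial anonymous-function stub when testing the surrounding pipeline
metricValue = computeMetric(@myPresenceRatio, param, rawData);
stubValue = computeMetric(@(p, d) 1, param, rawData);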

3. Hide complexity

Expose only what’s necessary to use a module or function, hiding the complex implementation details. Use abstraction layers to separate interface from implementation, providing clear and concise public APIs while keeping complex logic private. This approach not only makes your software easier to use but also allows you to refactor and optimize internal implementations without affecting users of your code.

Example: Complex algorithm with a simple interface

For instance, in bombcell there are many parameters. When we run the main script that calls all quality metrics, we also want to ensure all parameters are present and in the correct format.

function qMetric = runAllQualityMetrics(param, rawData, savePath)
    % Complex input validation that is hidden from the user
    param_complete = bc.qm.checkParameterFields(param);

    % Core function that calculates all quality metrics
    nUnits = length(rawData);

    for iUnit = 1:nUnits
        % steps 1 to n
        (...)
    end

end

Users of this function don’t need to know about the input validation or other complex calculations. They just need to provide input and options.

4. Write clear code

Clear code reduces the need for extensive documentation and makes your software more accessible to collaborators. Use descriptive and consistent variable names throughout your codebase. When dealing with specific quantities, consider adding units to variable names (e.g., ‘time_ms’ for milliseconds) to improve clarity. You can add comments to explain non-obvious logic and to add general outlines of the steps in your code. Following consistent coding style and formatting guidelines across your project also contributes to overall clarity.

Example: Improving clarity in a data processing function

Instead of an entirely mysterious function:

function [ns, sr] = ns(st, t)
ns = numel(st);
sr = ns/t;

Add more descriptive variable and function names and add function headers:

function [nSpikes, spikeRate_s] = numberSpikes(theseSpikeTimes, totalTime_s)
% Count the number of spikes for the current unit
% ------
% Inputs
% ------
% theseSpikeTimes: [nSpikesForThisUnit × 1 double vector] time, in seconds, of each of the unit's spikes.
% totalTime_s: [double] total recording time, in seconds.
% ------
% Outputs
% ------
% nSpikes: [double] number of spikes for the current unit.
% spikeRate_s: [double] spiking rate for the current unit, in spikes per second.
% ------
nSpikes = numel(theseSpikeTimes);
spikeRate_s = nSpikes/totalTime_s;
end

5. Design for testing

Incorporate testing into your design process from the beginning. This not only catches bugs early but also encourages modular, well-defined components.

Example: Testable design for a data analysis function

For the simple ‘numberSpikes’ function we define above, we can have a few tests to cover various scenarios and edge cases to ensure the function works correctly. For instance, we can test a normal case with a few spikes and an empty spike times input.

function testNormalCase(testCase)
    theseSpikeTimes = [0.1, 0.2, 0.3, 0.4, 0.5];
    totalTime_s = 1;
    [nSpikes, spikeRate] = numberSpikes(theseSpikeTimes, totalTime_s);
    verifyEqual(testCase, nSpikes, 5, 'Number of spikes should be 5');
    verifyEqual(testCase, spikeRate, 5, 'Spike rate should be 5 Hz');
end

function testEmptySpikeTimes(testCase)
    theseSpikeTimes = [];
    totalTime_s = 1;
    [nSpikes, spikeRate] = numberSpikes(theseSpikeTimes, totalTime_s);
    verifyEqual(testCase, nSpikes, 0, 'Number of spikes should be 0 for empty input');
    verifyEqual(testCase, spikeRate, 0, 'Spike rate should be 0 for empty input');
end

This design allows for easy unit testing of individual components of the analysis pipeline.
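
To actually run these tests with MATLAB’s unit testing framework, the test functions above can live as local functions in a single test file whose main function collects them with functiontests(localfunctions); the file name below is just an example:

% in a file named, for example, testNumberSpikes.m
function tests = testNumberSpikes
    tests = functiontests(localfunctions);
end

% testNormalCase and testEmptySpikeTimes go in the same file as local
% functions; the whole suite can then be run from the command window:
results = runtests('testNumberSpikes');
table(results)   % summarise which tests passed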

Part 2: Open Source Best Practices for Academia

While using version control and having a README, documentation, a license, and contribution guidelines are all essential, I have found that the following practices have the most impact:

Example Scripts and Toy Data

I have found that the most useful thing you can provide with your software is a set of example scripts, ideally with toy data that loads in the example script. Users can then quickly test your software and see how to use it on their own data, and are then more likely to adopt it. If possible, package the example scripts as Jupyter notebooks/MATLAB live scripts (or equivalent) demonstrating key use cases. In bombcell, we provide a small dataset (Bombcell Toy Data on GitHub) and a MATLAB live script that runs bombcell on this small toy dataset (Getting Started with Bombcell on GitHub).
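
As a rough illustration only (the folder, file and variable names below are placeholders, not the actual bombcell example), such a script can be as short as loading the toy data, setting parameters, and calling the main entry point shown earlier:

% gettingStarted_sketch.m – hypothetical names throughout
dataDir = 'toyData';                                  % folder containing the toy dataset
savePath = fullfile(dataDir, 'qualityMetrics');       % where results will be written
loaded = load(fullfile(dataDir, 'toyRawData.mat'));   % placeholder file name
rawData = loaded.rawData;
param = struct();                                     % simplified default parameters
param.percSpikesMissingMethod = 'gaussian';           % example option from above
[qMetric, unitType] = runAllQualityMetrics(param, rawData, savePath);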

Issue-Driven Improvement

To manage user feedback effectively, enforce the use of an issue tracker (like GitHub Issues) for all communications. This approach ensures that other users can benefit from conversations and reduces repetitive work. When addressing questions or bugs, consider if there are ways to improve documentation or add safeguards to prevent similar issues in the future. This iterative process leads to more robust and intuitive software.

Citing

Make your software citable quickly. Before (or instead of) publishing, you can generate a citable DOI using a service like Zenodo. Consider also publishing in the Journal of Open Source Software (JOSS) for light peer review. Clearly outline how users should cite your software in their publications to ensure proper recognition of your work.

Conclusion

These practices can help create popular, user-friendly, and robust academic software. Remember that good software design is an iterative process, and continuously seeking feedback and improving your codebase (and sometimes entirely rewriting/refactoring parts) will lead to more robust code.

To go deeper into principles of software design, I highly recommend reading “A Philosophy of Software Design” by John Ousterhout or “The Good Research Code Handbook” by Patrick J. Mineault.

Get involved! 

alt=""The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Join our mailing list, and follow us on X, formerly Twitter and LinkedIn, to stay connected for updates, events, and opportunities.

 

 

 

UCL Open Science & Scholarship Awards – Update from Mike and Gesche!

By Kirsty, on 21 August 2024

As part of our work at the Office this year, we’ve made it a priority to stay connected with all of our award winners. Some of them shared their experiences during our conference, and we’re already well on our way to planning another exciting Awards ceremony for this year’s winners!

You can apply now for the UCL Open Science & Scholarship Awards 2024 to celebrate UCL students and staff who are advancing and promoting open science and scholarship. The awards are open to all UCL students, PhD candidates, professional services, and academic staff across all disciplines. There’s still time to submit your applications and nominations in all categories – the deadline is 1 September!

To give you some inspiration for what’s possible in open science, Mike Fell has given us an update on the work that he and Gesche have done since receiving their award last year:


In autumn last year, we were surprised and really happy to hear we’d received the first UCL Open Scholarship Awards. Even more so when we heard at the ceremony about the great projects that others at UCL are doing in this space.

The award was for work we’d done (together with PhD colleague Nicole Watson) to improve transparency, reproducibility, and quality (TReQ) of research in applied multidisciplinary areas like energy. This included producing videos, writing papers, and delivering teaching and related resources.

Of course, it’s nice for initiatives you’ve been involved in to be recognized. But even better have been some of the doors this recognition has helped to open. Shortly after getting the award, we were invited to write an opinion piece for PLOS Climate on the role of open science in addressing the climate crisis. We also engaged with leadership at the Center for Open Science.

More broadly – although it’s always hard to draw direct connections – we feel the award has had career benefits. Gesche was recently appointed Professor of Environment & Human Health at the University of Exeter, and Director of the European Centre for Environment and Human Health. As well as highlighting her work on open science, and the award, in her application, this now provides an opportunity to spread the work further beyond the bounds of UCL and our existing research projects.

There’s still a lot to do, however. While teaching about open science is now a standard part of the curriculum for graduate students in our UCL department (and Gesche is planning this for the ECEHH too), we don’t have a sense that this is common in energy research, other applied research fields, or education more broadly. It’s still quite rare to see tools like pre-analysis plans, reporting guidelines, and even preprints employed in energy research.

A new research centre we are both involved in, the UKRI Energy Demand Research Centre, has been up and running for a year, and with lots of the setup stage now complete and staff in place, we hope to pick up a strand of work in this area. Gesche is the data champion for the Equity theme of that centre. The new focus must be on how to better socialize open research practices and make them more a part of the culture of doing energy research. We look forward to continuing to work with UCL Open Science in achieving that goal.

Get involved!

alt=""The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Join our mailing list, and follow us on X, formerly Twitter and LinkedIn, to stay connected for updates, events, and opportunities.

 

 

 

Copyright and AI, Part 2: Perceived Challenges, Suggested Approaches and the Role of Copyright literacy

By Rafael, on 15 August 2024

Guest post by Christine Daoutis (UCL), Alex Fenlon (University of Birmingham) and Erica Levi (Coventry University).

This blog post is part of a collaborative series between the UCL Office for Open Science and Scholarship and the UCL Copyright team exploring important aspects of copyright and its implications for open research and scholarship. 

[Image: colourful wavy ribbons radiating outward from a grey square.]

An artist’s illustration of AI by Tim West. Photo by Google DeepMind from Pexels.

A previous post outlined copyright-related questions when creating GenAI materials—questions related to ownership, protection/originality, and infringement when using GenAI. The post discussed how answers to these questions are not straightforward, largely depend on what is at stake and for whom, and are constantly shaped by court cases as they develop.

What does this uncertainty mean for students, academics, and researchers who use GenAI and, crucially, for those in roles that support them? To what extent does GenAI create new challenges, and to what extent are these uncertainties inherent in working with copyright? How can we draw on existing expertise to support and educate on using GenAI, and what new skills do we need to develop?

In this post, we summarise a discussion we led as part of our workshop for library and research support professionals at the Research Libraries UK (RLUK) annual conference in March 2024. This year’s conference title was New Frontiers: The Expanding Scope of the Modern Research Library. Unsurprisingly, when considering the expanding scope of libraries in supporting research, GenAI is one of the first things that comes to mind.

Our 16 workshop participants came from various roles, research institutions, and backgrounds. What they had in common was an appetite to understand and support copyright in the new context of AI, and a collective body of expertise that, as we will see, is very useful when tackling copyright questions in a novel context. The workshop consisted of presentations and small group discussions built around the key themes outlined below.

Perceived Challenges and Opportunities
Does the research library community overall welcome GenAI? It is undoubtedly viewed as a way to make scholarship easier and faster, offering practical solutions—for example, supporting literature reviews or facilitating draft writing by non-English speakers. Beyond that, several participants see an opportunity to experiment, perhaps becoming less risk-averse, and welcome new tools that can make research more efficient in new and unpredictable ways.

However, concerns outweigh the perceived benefits. It was repeatedly mentioned that there is a need for more transparent, reliable, sustainable, and equitable tools before adopting them in research. Crucially, users need to ask themselves what exactly they are doing when using GenAI, their intention, what sources are being used, and how reliable the outputs are.

Concerns over copyright in GenAI were seen as an opportunity to place copyright literacy at the forefront. The need for new guidance is evident, particularly around the use of different tools with varying terms and conditions, and it is also perceived as an opportunity to revive and communicate existing copyright principles in a new light.

Suggested Solutions
One of the main aims of the workshop was to address challenges imposed by GenAI. Participants were very active in putting forward ideas but also expressed concerns and frustration. For example, they questioned the feasibility of shaping policy and processes when the tools themselves constantly evolve, when there is very little transparency around the sources used, and when it is challenging to reach agreement even on essential concepts. Debates on whether ‘copying’ is taking place, whether an output is a derivative of a copyrighted work, and even whether an output is protected are bound to limit the guidance we develop.

Drawing from Existing Skills and Expertise
At the same time, it was acknowledged that copyright practitioners already have expertise, guidance, and educational resources relevant to questions about GenAI and copyright. While new guidance and training are necessary, the community can draw from a wealth of resources to tackle questions that arise while using GenAI. Information literacy principles should still apply to GenAI. Perhaps the copyright knowledge and support are already available; what is missing is a thorough understanding of the new technologies, their strengths, and limitations to apply existing knowledge to new scenarios. This is where the need for collaboration arises.

Working Together
To ensure that GenAI is used ethically and creatively, the community needs to work collaboratively – with providers, creators, and users of those tools. By sharing everyday practices, we can inform and shape decisions, guidance, and processes. It is also important to acknowledge that the onus is not just on copyright practitioners to understand the tools but also on the developers to make them transparent and reliable. Once the models become more transparent, it should be possible to support researchers better. This is even more crucial in supporting text and data mining (TDM) practices – critical in many research areas – to limit further restrictions following the implementation of AI models.

Magic Changes
With so much excitement around AI, we felt we should ask the group to identify the one magic change that would help remove most of the concerns. Interestingly, the consensus was that clarity around the sources and processes used by GenAI models is essential. How do the models come up with their answers and outputs? Is it possible to have clearer information about the sources’ provenance and the way the models are trained, and can this inform how authorship is established? And what criteria should be put in place to ensure the models are controlled and reliable?

This brings the matter back to the need for GenAI models to be regulated—a challenging but necessary magic change that would help us develop our processes and guidance with much more confidence.

Concluding Remarks
While the community of practitioners waits for decisions and regulations that will frame their approach, it is within their power to continue to support copyright literacy, referring to new and exciting GenAI cases. Not only do those add interest, but they also highlight an old truth about copyright, namely, that copyright-related decisions always come with a degree of uncertainty, risk, and awareness of conflicting interests.

About the authors 

Christine Daoutis is the UCL Copyright Support Officer at UCL. Christine provides support, advice and training on copyright as it applies to learning, teaching and research activities, with a focus on open science practices. Resources created by Christine include the UCL Copyright Essentials tutorial and the UCL Copyright and Your Teaching online tutorial.

Alex Fenlon is the Head of Copyright and Licensing within Libraries and Learning Resources at the University of Birmingham. Alex and his team provide advice and guidance on copyright matters, including text and data mining and AI, to ensure that the law and practice are understood by all.

Erica Levi is the Digital repository and Copyright Lead at Coventry University. Erica has created various resources to increase awareness of copyright law and open access through gamification. Her resources are available on her website.

Get involved!

alt=""The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Join our mailing list, and follow us on X, formerly Twitter and LinkedIn, to be part of the conversation and stay connected for updates, events, and opportunities.

 

 

 

Text and Data Mining (TDM) and Your Research: Copyright Implications and New Website Guidance

By Rafael, on 13 May 2024

This is the second blog post in our collaborative series between the UCL Office for Open Science and Scholarship and the UCL Copyright team. Here, we continue our exploration of important aspects of copyright and its implications for open research and scholarship. In this instalment, we examine Text and Data Mining (TDM) and its impact on research, along with the associated copyright considerations.

[Image: data processing concept illustration.]

Image by storyset on Freepik.

The development of advanced computational tools and techniques for analysing large amounts of data has opened up new possibilities for researchers. Text and Data Mining (TDM) is a broad term referring to a range of ‘automated analytical techniques to analyse text and data for patterns, trends, and useful information’ (Intellectual Property Office definition). TDM has many applications in academic research across disciplines.

In an academic context, the most common sources of data for TDM include journal articles, books, datasets, images, and websites. TDM involves accessing, analysing, and often reusing (parts of) these materials. As these materials are, by default, protected by copyright, there are limitations around what you can do as part of TDM. In the UK, you may rely on section 29A of the Copyright, Designs and Patents Act, a copyright exception for making copies for text and data analysis for non-commercial research. You must have lawful access to the materials (for example via a UCL subscription or via an open licence). However, there are often technological barriers imposed by publishers preventing you from copying large amounts of material for TDM purposes – measures that you must not try to circumvent. Understanding what you can do with copyright materials, what may be more problematic, and where to get support if in doubt should help you manage these barriers when you use TDM in your research.

The copyright support team works with e-resources, the Library Skills librarians, and the Office for Open Science and Scholarship to support the TDM activities of UCL staff and students. New guidance is available in the TDM libguide on the copyright website and addresses questions that often arise during TDM, including:

  • Can you copy journal articles, books, images, and other materials? What conditions apply?
  • What do you need to consider when sharing the outcomes of a TDM analysis?
  • What do publishers and other suppliers of the TDM sources expect you to do?

To learn more about copyright (including how it applies to TDM):

Get involved!

alt=""The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on X, formerly Twitter, LinkedIn, and join our mailing list to be part of the conversation!

 

 

Launching today: Open Science Case Studies

By Kirsty, on 29 April 2024

Announcement from Paul Ayris, Pro-Vice Provost, UCL Library, Culture, Collections and Open Science

[Image: a close-up of old leather-bound books on a shelf.]

How can Open Science/Open Research support career progression and development? How does the adoption of Open Science/Open Research approaches benefit individuals in the course of their career?

The UCL Open Science Office, in conjunction with colleagues across UCL, has produced a series of Case Studies showing how UCL academics can use Open Science/Open Research approaches in their plans for career development, in applications for promotion and in appraisal documents.

In this way, Open Science/Open Research practice can become part of the Research Culture that UCL is developing.

The series of Case Studies covers each of the 8 pillars of Open Science/Open Research. They can be found on a new webpage: Open Science Case Studies 4 UCL.

It is only fair that academics should be rewarded for developing their skills and adopting best practice in research and in its equitable dissemination. The Case Studies show how this can be done, and each Case Study identifies a Key Message which UCL academics can use to shape their activities.

Examples of good practice are:

  • Publishing outputs as Open Access outputs
  • Sharing research data which is used as the building block of academic books and papers
  • Creating open source software which is then available for others to re-use and develop
  • Adopting practices allied to Reproducibility and Research Integrity
  • The responsible use of Bibliometrics
  • Public Engagement: Citizen Science and Co-Production as mechanisms to deliver results

Contact the UCL Open Science Office for further information at openscience@ucl.ac.uk.