Centre for Advanced Research Computing

ARC is UCL's research, innovation and service centre for the tools, practices and systems that enable computational science and digital scholarship

Team (u)CLI finish 3rd!

By Samantha Ahern, on 9 December 2024

Team (u)CLI finish 3rd in the national Computing Insights UK student Cluster Challenge!

Team (u)CLI

Team members from left to right:

  • Zak Morgan, Dept of Computer Science
  • Rozenn Raffaut, Dept of Med Phys & Biomedical Eng
  • Tom Bickley, Dept of Chemistry
  • Yuliang Huang, Dept of Med Phys & Biomedical Eng
  • Qi Li,  Dept of Med Phys & Biomedical Eng

The CIUK Cluster Challenge is a national inter-collegiate competition. This year there were 16 student teams taking part from across the UK.

The competition consisted of three challenges during October and November, culminating in a further three challenges at Computing Insights UK (CIUK) in Manchester in December.

The team performed well in the initial pre-conference challenges, and were at the top of the leaderboard heading into the onsite challenges.

They were supported throughout by the team mascot, Archie ‘ARC’vark.

Team mascot: Archie ‘ARC’vark

Competition was fierce and overall team (u)CLI finished 3rd out of the 15 teams that completed all challenges, narrowly missing out on 2nd place by 2 points. Full results are available on the CIUK Cluster Challenge website.

Overall the team performed very well and received very positive feedback from several of the challenge setters.

We look forward to supporting future teams in new challenges, and building a community of students who want to participate through our termly Cluster Club sessions.

The ARC TRE

By George Svarovsky, on 28 November 2024

The Advanced Research Computing Trusted Research Environment (ARC TRE) is available for UCL researchers. It can be found at https://tre.arc.ucl.ac.uk/.

What is the ARC TRE?

The ARC TRE provides compute environments, with desktops for each researcher, for projects working on sensitive data. You can think of each project space as a secured virtual room with computers and a network – and strict rules about who can bring data in and take data out. It’s a safe setting, where safe projects with safe people can work on safe data and produce safe outputs.

We’re now onboarding projects for which the environment is suitable, via the Information Governance assurance process. Suitability of the environment will expand through 2025, with certification for NHS data early in the year, and ISO27001 certification in mid-year; so please do consider the environment for projects in planning.

How is it different?

In designing a next-generation environment for research computing on sensitive data, ARC had some ambitious principles in mind.

One size does not fit all. The environment allows projects to choose the compute resources they need, and also the tools, and even their internal governance and fine-grained data permissions. Changes in one project do not affect any other project.

For science to be reproducible, its software setup must be too. The TRE provides projects with container technology (Docker/Podman and Snap), a key enabler for making software setup repeatable. Not everyone will be familiar with this technology! With this design decision, ARC is doubling down on its mission: to support researchers in making the best use of the tools, practices and systems that enable computational science and digital scholarship. Our Collaborations & Consultancy team are ready to help.

Unbounded research needs unbounded compute. ARC is investing in research compute, and these platforms should be just as available for use with sensitive data as for any other research. The TRE uses cloud practices to provision the required security, initially on a public cloud platform, AWS – giving virtually infinite capacity – and, in future, on ARC’s research compute and storage platforms – giving access to high-performance compute at much lower cost.

Who can use it?

The ARC TRE is running under a new Information Security Management System (ISMS), which is in the process of being operationalised at ARC. This means the environment is not yet ready to take NHS data, or data that explicitly requires ISO27001 certification, but both certifications are planned for the first half of 2025. So, if your data does not require these certifications, or you are planning research for later in 2025, you can consider using the ARC TRE. In any case, the Information Governance assurance team or the TRE team will be happy to advise on the suitability of the environment for your research.

In terms of costs, the ARC TRE has a straightforward and permissive pricing structure:

  • For basic usage (a moderately powerful t3a.medium desktop per user, and 100GB of project storage), the ARC TRE is free.
  • If you require more than this, we will ask you to include some “facility” costs in your funding. We use a transparent model based on AWS pricing. As an example, requesting a p3.2xlarge (with a V100 GPU with 16GB VRAM) will cost about £140 a month.
  • We will not rescind access or charge you, even if you end up using more compute than originally estimated, or if the AWS price fluctuates. (In the case of accidental or unreasonable overuse, we’ll have a polite conversation!)

Early adopters of the ARC TRE might find they are talking to the team, and influencing the development of the environment, more than they expect! The project is run using Agile and DevOps, and we will continuously improve the platform (always taking great care to ensure projects in flight are not adversely affected).

What’s in it for ARC?

ARC is UCL’s research, innovation and service centre for the tools, practices and systems that enable computational science and digital scholarship. We expect concrete learning from and for each TRE project: benefiting both ARC and the research teams.

As ARC collaborates on different projects with their own configurations and procedures to solve their own unique challenges, we will be building up an armoury of configurations and procedures that can be used again. A configuration that worked well with one dataset will be invaluable for another project with a similar dataset. Data owners will find that we can meet their needs repeatedly and repeatably.

As we expect to manage this service through many hundreds of projects, we will be able to carry the knowledge learnt from one project into future projects, as well as working out how to gather and organise this information to continuously improve our information security and user experience.

Project Example: Machine-Learning Assisted Diagnosis

We are engaged with a research and innovation group with an ambition to provide a mobile phone-based solution for assisting the diagnosis of jaundice in babies, a prime indicator of liver disease later in life. This project will be supported by the TRE through its innovation journey from research to spin-out:

  1. Consented patient photographs are captured and ingressed into the TRE, and medical records ingressed from the hospital.
  2. Machine learning (ML) models can be developed in a TRE Project using the babies’ sensitive data.
  3. The ML models can continue to be refined in the TRE while and after the production system is built on the public AWS cloud, with new model versions being egressed and deployed.
  4. Learnings and components from the TRE Project can be shared and reused, including best practices, and hospital data pipeline integration.

What about the Data Safe Haven?

The current Data Safe Haven was designed over ten years ago. Its environment meets many research requirements, but it was not designed to support more modern data handling methodologies. The ARC TRE will provide a modern computing environment with the additional flexibility and power to support those UCL researchers for whom the Data Safe Haven is limiting, providing the same level of Information Governance assurance. Development is under way and is being carried out in an iterative fashion, with new features being rolled out over time.

UCL ARC and ISD are committed to supporting the Data Safe Haven until the ARC TRE is ready to take on all existing researchers. As new features are available in the TRE, research studies will be migrated in a controlled and supported manner. At this stage it is hard to put a timescale on this. Any changes will be fully supported with training and assistance.

What’s next?

As mentioned, the team is set up to continuously improve the ARC TRE. We have several capabilities that we know will be needed, and we will work with researchers to refine these requirements and to discover new ones.

Machine ingress. We’re already working on a secure API so that data can be fed into Projects without human interaction.

More shared resources. ARC and researchers can make contributions to the resources available to all ARC TRE projects, including common software tools and reference data. We expect to greatly expand the available resources over the coming months.

Inter-project sharing. We intend to allow projects with the requisite information governance to exchange data without it having to leave the environment.

Project templates. We strongly encourage projects in the ARC TRE to use software version control for their project materials. This is an essential tool for collaboration; and provides a history to aid transparency and recall of decisions. ARC will demonstrate how to re-use existing project setup repositories, which can become templates for multiple projects.

Archiving. The ARC TRE is not intended for long-term storage of research data; instead, it’s a compute environment for active research. However, many projects require archival of sensitive data. For now, researchers can leave their projects on the TRE, where the data is securely backed-up. We’re working with the ARC Data Stewards to ensure that project archival requirements can be met optimally in future.

Windows desktops. The ARC TRE provides Red Hat Enterprise Linux desktops. Our initial user feedback has been (slightly surprisingly) positive about this decision – but we know that many projects in the Data Safe Haven make use of Windows tools. A managed Windows environment for sensitive data will continue to exist for as long as it’s needed – design of the long-term solution is ongoing.

Thanks

A great many people have been involved in this project, at ARC, ISD, ISG and beyond, and we are grateful to one and all.

The core technical team of Tom Young (lead), Robert Vickerstaff, and Sunny Park; enabled by delivery manager Rupert Roker. Architecture and security management system led by Tim Machin, ably supported by Finley Bacon who also took on the service management; and the ARC information governance team Martin Donnelly, Preeti Matharu and Victor Olago. Ongoing and valued expertise from Jack Hindley, Dan Bretherton, Trevor Peacock, Anthony Peacock (no relation), David Solomon, Werner Niekirk, Ben Thomas, and others who have been pestered about one thing or another since January. User representation from Nick Owen, Socrates Varakliotis, John Watts, Michelle Harricharan, and many more. And the leadership that sponsored and fought for the project: Donna Swan, Rachel Cooper, and James Hetherington.

Want to know more?

The environment is visible to anyone with a UCL login, at https://tre.arc.ucl.ac.uk/

Information on Sensitive Data and Trusted Research Environments

Homepage of the Trusted Research Environment Assurance process

George Svarovsky is a Principal Research Software Engineer working in UCL’s Centre for Advanced Research Computing. George is the Product Owner for the Trusted Research Environment, working with the service’s users to understand their needs and ensure that the service his team builds meets those needs. If you would like to get in touch with George to talk about how the TRE can best support your research then please contact the team.

UCL ARC at the 2024 International RSE Conference

By Jonathan Cooper, on 18 November 2024

This year’s RSE conference took place from 3rd – 5th September in Newcastle. Around 20 ARC staff attended in person, with several others joining in the hybrid experience remotely. The latter has been steadily improving year on year, at least for contributing to the formal parts of the conference programme. The informal conversations are still much better in person, and this aspect was particularly appreciated by our newer RSEs. This blog post summarises our joint impressions of the conference, based on a debriefing discussion we had in our weekly “Collaboration Hour” later in September, and edited by Jonathan Cooper. We have raced against the conference committee to get this blog post out before the conference materials are published!

The programme had a mix of technical topics and sessions devoted to how we work as RSEs, and indeed as wider “Research Technology Professionals” (RTPs) as well. RTP is a newly coined term used by UKRI among others to refer to the wide range of specialist roles that exist alongside traditional researchers, encompassing all the ARC professions and more. Several initiatives are aiming to leverage the success of the RSE movement to advance other professions in a similar way, some of which we are involved in or which were proposed at the conference, notably in the RSE leaders and aspiring leaders satellite event on the Monday. This event has grown massively since its inception as a safe space for those struggling to create RSE groups to share the pain and learn from each other! Now there are many kinds of RSE group, many individuals in different RTP leadership roles, and much more wide-ranging discussions as a result. I particularly appreciated the session on the skills that leaders need in this environment – what people have found helpful and how we should be growing the next generation.

A similar topic was covered in the RSE Competencies workshop, although this one looked at all areas of RSE skills and tried to categorise them. We ended up with more non-technical skills than those focused on specific technologies. The work is ongoing, with monthly community meetings, aiming to build a toolkit that will help people advance their careers by identifying the skills they need and avenues for training and professional development.

Several sessions focused on project management and Agile approaches. We heard from Manchester how they are adapting Scrum for their research projects, notably their categorisation of projects according to how large they are and how engaged the researchers are, and therefore the different sorts of tweaks they’ve made to Scrum in each case. These seem to be fairly similar to how we operate in ARC, but in a more formalised structure. We contributed to the discussion session they ran on the following day (led by Sarah Jaffa, formerly of ARC!), with Monika and me doing a double act presenting a high-level view of our approach and a summary of Kanban. An important theme of that session was that project management is as much about self-care as it is about delivering on the goals of the project, and these aspects need to be well balanced or both suffer. A special interest group (SIG) is being set up dedicated to project management, and we have continued discussions within ARC too, with a recent blog post on adapting the agile values and principles to a research context.

Others in the group focused more on the technical programme. Mutation testing was one highlight – described as being a bit like test coverage, except that your code is randomly changed to see what breaks. If no tests fail, the mutation has revealed an untested code path for which a test should be added. It’s good for catching edge cases that haven’t been thought of, but does take time to run. We noted that it works best as part of a wider array of testing approaches, for example property-based testing (as with the Hypothesis library, which randomises code inputs rather than the code itself).
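
To make the idea concrete, here is a tiny hand-written sketch (not the output of any real mutation testing tool; the function and tests are invented): a “mutant” that survives the whole test suite points at behaviour the tests never check.

```python
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(value, high))


def clamp_mutant(value, low, high):
    # A mutation testing tool would generate small edits like this one
    # automatically (here the min(value, high) call has been dropped)
    # and rerun the test suite against each mutant.
    return max(low, value)


def run_tests(fn):
    assert fn(5, 0, 10) == 5    # value already within range
    assert fn(-3, 0, 10) == 0   # clamped at the lower bound
    # Without the assertion below, the mutant would "survive" the suite,
    # revealing that the upper bound is never exercised by the tests.
    assert fn(15, 0, 10) == 10


if __name__ == "__main__":
    run_tests(clamp)  # original implementation: all assertions pass
    try:
        run_tests(clamp_mutant)
    except AssertionError:
        print("mutant killed: the upper-bound behaviour is covered by a test")
```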

Best practices for setting up development environments were covered in a couple of talks, and how this is perhaps one aspect that distinguishes an RSE from a CS researcher. These range from use of pip and pipenv in Python to things like dev containers and Nix. These are important for reproducibility. The Netherlands eScience Centre python project template had a nice feature that allows updating projects created using a template when updates to the template itself are made.

Several talks looked at performant Python. We were surprised (perhaps unfairly) at how much impact simply upgrading to the latest Python version can have. Tools like numba and approaches such as vectorisation were well known, but tips for using list comprehensions, sometimes in preference even to Pandas apply operations, were appreciated and will be useful for several of our projects.
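
As a toy illustration of that last point (the DataFrame and column names below are made up), a list comprehension over the column values can be noticeably quicker than a row-wise apply for simple logic:

```python
import pandas as pd

df = pd.DataFrame({"width": range(100_000), "height": range(100_000)})

# Row-wise apply: flexible, but calls a Python function per row through
# pandas' machinery, which is comparatively slow.
df["area_apply"] = df.apply(lambda row: row["width"] * row["height"], axis=1)

# A list comprehension over the underlying values is often much faster for
# simple element-wise logic with no obvious vectorised equivalent.
df["area_comp"] = [w * h for w, h in zip(df["width"], df["height"])]

assert df["area_apply"].equals(df["area_comp"])
# (For something this simple, the fully vectorised df["width"] * df["height"]
# is faster still; the comprehension shines when the per-row logic is harder
# to express in vectorised form.)
```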

As you might expect from ARC we had significant involvement in the high-performance computing sessions, including Tuomas running a “birds of a feather” (BOF) event for the HPC RSE community and giving several talks. Talks not by us covered a range of topics, including the age-old comparison of the merits of different languages and porting between them, the newer frameworks aiming to ease GPU programming, portability between different hardware, and debugging parallel programs. We enjoyed trying out the Grace Hopper chips in IsambardAI, and discussing how to utilise HPC in the most environmentally sustainable way. The conclusion from Archer2 is that given the CO2 released by manufacturing HPC systems, the best option is to run them as intensively as possible since this maximises the research done for a given carbon cost – and indeed that personal lifestyle changes may be a better option for minimising your impact!

Green RSE was the focus for another BOF which some ARC staff attended. A SIG is being set up for this, trying to raise awareness of what RSEs can do and consider what training might be helpful. This is something we want to get involved with more at ARC, starting with an inventory of our current state in conjunction with the department’s Green Team.

The Fortran satellite event was very well run. It revealed that many people want similar improvements to the Fortran ecosystem to support automated testing and the like. We have recently started an initiative along those lines at ARC so will be trying to work with the wider community on this and avoid duplication of effort, having now met some relevant people.

Some talks focused more on particular research domains. Given ARC’s current efforts developing Trusted Research Environments (TREs), we were interested in the Turing’s approach. They worry less about packages coming into the secure environment, advocating simply proxying CRAN, PyPI and the like, while making sure that your infrastructure is set up securely enough that things can’t get out unless you want them to. So if something does get in and cause havoc, it shouldn’t be able to egress any data and it should only affect a single study or project. This is also the approach we take in ARC’s TRE.

The unconference session was a highlight for some, particularly the discussion of developing software as a medical device. This covered trying to work with people in the institution to come up with processes, but also trying to figure out what the role of RSEs and the Society of RSE should be in that process. Are we just followers or the ones coming up with the process? How much of the regulatory side is our responsibility, and who else do we need to work with? No firm answers from that, but lots of questions! A prototype tool we developed for one project may be useful here if we can get funding and/or collaborators to continue the work.

All found the keynote talks inspiring, especially the one from Anne-Marie Imafidon. The test driven development workshop got a mention as being really clearly presented with great materials, as did an interesting C++ graphics library called Morphologica.

We are all looking forward to next year’s conference in Warwick. On a personal level, I’m especially pleased that it is likely not to conflict with INSET days and I’ll be able to be there in person again.

Alignment of the Agile Manifesto to a Research Context

By Monika Byrne Svata, on 14 November 2024

This article proposes aligning the language of the original Agile Manifesto – written over 20 years ago, for software development in a commercial setting – with our current context of digital research projects involving research software engineering, data science, data stewardship, and research infrastructure development.

This work was inspired by discussion about the wording of the Agile Manifesto during the regular Agile Training for Research projects that we run in UCL’s Advanced Research Computing Centre (ARC) for senior staff in our Collaborations team. To gain wider input from colleagues we devoted two ARC “Collaboration Hour” sessions to this topic, with additional conversation held on Slack, some email input, and a period where a draft of this article was available for internal comment.

We hold that the core ideas behind Agile, such as responding to change, valuing people interactions, etc., are valid and beneficial in a research context. However, the specific expression of these can perhaps be improved upon – in true Agile fashion! Our aim is that this will make it easier to apply the Agile principles in the management of our Collaborations projects, by removing the cognitive dissonance caused by language inspired by a different context. By publishing this article, we hope that others will see a similar benefit, and we invite feedback from the community.

 

The Original Agile Manifesto

The manifesto originated in February 2001, in a meeting of representatives from emerging ‘lightweight’ software development approaches, in response to the need for an alternative to documentation-driven, heavyweight software development processes.

Although there are many frameworks to aid the application of Agile approaches in particular settings, the manifesto emphasises that a change of culture within organisations and teams is the key element in, and a condition for, successfully implementing Agile ways of working.

“While the Manifesto provides some specific ideas… there is a deeper theme of values based on trust and respect for each other and promoting organizational models based on people, collaboration, and building the types of organizational communities in which we would want to work.”

“So, in the final analysis, the meteoric rise of interest in – and sometimes tremendous criticism of – Agile Methodologies is about the mushy stuff of values and culture.”

For a fuller history, visit the Agile Manifesto website.

The original Agile Manifesto contains 4 Agile Values, and 12 Agile Principles.

Below we give the original text of each alongside our updated version and discuss the reasons for our proposed revisions.

 

 

Key Terms

Although the wording of each of the values and principles has been considered separately, to make sure that it reflects both the original meaning and its application to research/academia, we found it useful to first give some consideration and space for discussion to some of the repeated key terms and to the realities of research projects.

 

Original wording: Customer
Discussion about new wording: ‘Customer’ implies negotiation and a zero-sum game, rather than a collaboration with a common goal. This also applies to the term ‘Client’.

‘End user’ is a specific term that might not correctly reflect the reality of a research project or correctly describe the collaborators.

‘Collaborators’, ‘our collaborators’ and ‘all collaborators’ feel like the terms that best describe this role.

Original wording: Valuable software / Working software
Discussion about new wording: Terms like ‘valuable software’, ‘working software’ or ‘digital artefacts’ are too limiting, as the outputs of collaboration projects are often something other than software (e.g. research, teaching/education, a service, etc.).

The suggested terms that felt acceptable included ‘desired outcome’, ‘academic output’, ‘the research’ and ‘research outcome’.

Original wording: Developers
Discussion about new wording: Research Technology Professionals (RTPs).

Original wording: Business people
Discussion about new wording: Depending on the context of the individual principle, terms like ‘researchers’ or ‘domain experts’ felt appropriate.

Original wording: Major current areas of pain for research projects
Discussion about new wording: The original context of the Agile Manifesto, expressed in the Agile Values, was a response to the reality of rigid over-planning and over-documenting, where any change, learning or other deviation from the original assumptions was seen as disruptive and a risk.

As the reality of research projects in 2024 carries different issues and risks, we wanted to keep these in mind, so that the values address them.

Some of the pain points of research projects highlighted in the discussion:

  • Insufficient documentation (leaving ‘breadcrumbs’ behind)
  • The scale and ambiguity of the research outcomes
  • Parallel working on multiple projects
  • Limited longevity of the projects and teams due to grant work

 

 

Revised Agile Values

Below is the original wording of Agile Values followed by the new wording that is the result of ARC-wide discussion, and in our view best represents their application to research projects.

 

Original:

  • Individuals and interactions over processes and tools.
  • Working software over comprehensive documentation.
  • Customer collaboration over contract negotiation.
  • Responding to change over following a plan. 

That is, while there is value in the items on the right, we value the items on the left more.

 

Agile Values for Research Projects:

In these statements, while there is value in the items on the right, we value the items on the left more.

  • Individuals and interactions supported by suitable processes and tools.
  • Working solutions supported by adequate documentation.
  • Collaboratively responding to change supported by agile planning.

 

Discussion Points:

  • To highlight the importance of all elements of delivery (including documentation, tools, processes, planning etc.), we agreed to move the sentence stressing this point to the start. For the same reason, we changed the word ‘over’ to ‘supported by’.
  • To denote that the processes and tools are in service of the main outcome, we added the word ‘suitable’.
  • The term ‘comprehensive’ documentation has been updated to ‘adequate’ documentation to reflect that the detail, format, and amount of documentation needs to be fit for purpose rather than a goal or outcome in its own right.
  • ‘Contract negotiation’ in research is different than in a business setting, being typically less adversarial and restricted to agreement with funders. The concept as evoked in the original values applies more to the process of requirements elicitation and jointly planning for the project delivery, so we agreed to merge the values related to contracts and to planning, with the overarching theme of collaborative work. This is to stress that the nature of scoping, planning and delivery of research projects is collaborative and evolving, rather than a fixed result of prior negotiations.

 

Revised Agile Principles

 

For each principle we set out the original wording followed by the new wording that is the result of the ARC-wide discussion and best represents their application to research projects.

Included are also some of the main discussion points to clarify the thought process that went into the updated wording.

 

Principle 1

 

Original:

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

 

New Wording for Research Projects:

Our highest priority is early and continuous delivery of valuable outputs through meaningful collaboration.

 

Discussion Points:

  • How to define ‘customer’. Suggestions included ‘domain experts and users’, ‘the world’, ‘collaborators’, and ‘researchers’. In the end, we agreed that highlighting a ‘customer’ in this principle is unnecessary, as the purpose of the collaboration project is not aimed at only one of the parties (regardless of their name).
  • The word ‘satisfy’ implies that our contribution is to deliver someone else’s requirements, as opposed to actively collaborating on research as equal partners.
  • The words ‘early’ and ‘continuous’ carry the key point of this principle, therefore we made sure they are included in the new version.
  • The output of our projects is not necessarily ‘valuable software’ but it might be research, training, software, digital solutions or data management to enable research, or a combination of the above.

 

 

Principle 2

 

Original:

Welcome changing requirements, even late in development.  Agile processes harness change for the customer’s competitive advantage.

 

New Wording for Research Projects:

Welcome changing requirements, even late in development. Agile processes harness change for the benefit of the collaborative research outcome.

 

Discussion Points:

  • ‘Customer’ and ‘competitive advantage’ do not apply well to research projects, and it is important to define what it is we are trying to maximise.
  • The words ‘welcome changing requirements’ and ‘even late in development’ are key in this message and we made sure they made it into the new version. It is understood that this doesn’t mean indiscriminate implementation of any change (early or late); rather it means that the ability to assess changes and deal with them appropriately is an expected part of the process.

 

 

Principle 3

 

Original:

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

 

New Wording for Research Projects:

Deliver meaningful outputs frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

 

Discussion Points:

  • The ‘working software’ is not the only possible output – as discussed before. The considered options for this principle were ‘research outcome’ or ‘output’.
  • The term ‘research outcome’ was found to be closer in meaning to the end result of the project, whereas ‘output’ can be a partial deliverable or result of any kind (software functionality, bug fix, result or a partial achievement of a particular research question, documentation of rules/requirements, update of data etc.). As the point of this principle is to stress frequent delivery of interim outputs, the term ‘outputs’ was found best suited.
  • ‘Meaningful’ output has been added to denote the principle of producing an output that is not only a part of the final deliverables (e.g. final documentation – valuable as it is) but crucially steers the project towards better understanding of the requirements or solution and achieving its main goals.

 

 

Principle 4

 

Original:

Businesspeople and developers must work together daily throughout the project.

 

New Wording for Research Projects:

Domain experts and research technology professionals aim to work together daily throughout the project.

 

Discussion Points:

  • ‘Businesspeople’ in research means anyone who brings knowledge of the research domain that we are collaborating on. This can be researchers, post-docs, user representatives (e.g. in cases of co-creation), business representatives (in cases of collaboration with industry), etc. ‘Domain experts’ was agreed to cover all these possibilities.
  • ‘Research Technology Professionals’ covers all professions within ARC (Research Software Engineers, Research Data Scientists, Research Data Stewards, Research Infrastructure Developers, PRISMs) and is a term used by UKRI.
  • Although the RTPs are also domain experts in their own right, the point of this principle is that the technical aspects of the project should be worked on in very close collaboration with the non-technical experts. Therefore, we kept the demarcation of the technical and non-technical experts for this principle, rather than covering them by the term ‘collaborators’ as we do in some of the other principles.
  • The ambition of working together ‘daily’ was questioned in the discussion, as it is a demanding requirement that is rarely practicable. However, as the principles describe the recommended ideal (e.g. team members full time on a single project, a product owner with good availability and direct accountability), it is very useful and important to have this principle stated in its ideal, undiluted form. For the cases where compromises need to be found (e.g. team members working part time, low availability of collaborators, etc.), it is useful to understand the reasons for these compromises and what the most reasonable adjustments are.
  • Due to the ambitious nature of ‘daily’ communication and collaboration, the word ‘must’ was viewed as too strong and was rephrased as an aim.

 

 

Principle 5

 

Original:

Build projects around motivated individuals. Give them the environment and support they need and trust them to get the job done.

 

Wording for Research Projects – No Change:

Build projects around motivated individuals. Give them the environment and support they need and trust them to get the job done.

 

Discussion Points:

  • There was no challenge to the content or wording of this principle. Arguably, in the context of research, this principle is not only as relevant as in a commercial setting, but also closer to the ethos of the individuals and teams working in this environment than it is in business.

 

 

Principle 6

 

Original:

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

 

New Wording for Research Projects:

The most efficient and effective method of communication in a research team is synchronous conversation.

 

Discussion Points:

  • In our current working environment, it is not reasonable to assume that teams are physically collocated, therefore physical face-to-face conversation is frequently not feasible. Synchronous conversation (such as Teams call, Slack huddle or similar) is the next best option.
  • Synchronous conversation is not everyone’s preferred method of communication and there are situations where conveying information might be better suited to other media. However, synchronous communication enables richer and more nuanced information exchange, faster and more efficiently than asynchronous communication, and is therefore essential to establish as a regular communication channel for a team.
  • In the original principles, the ‘development’ team might imply mainly the involvement of technical professionals. However, as the outcome of research projects is often research rather than software, this principle applies to all members of the team, including the domain experts.

 

 

Principle 7

 

Original:

Working software is the primary measure of progress.

 

New Wording for Research Projects:

Research outputs are the primary measure of progress.

 

Discussion Points:

  • The main discussion was how to define the key output of research projects: one that can be produced regularly and in the interim, before the end of the project. We chose ‘research outputs’ in favour of ‘research outcome’, as the ‘research outcome’ is often reached only at the end of the project.

 

 

Principle 8

 

Original:

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.

 

New Wording for Research Projects:

Agile processes promote development at a sustainable pace for the whole team, without having to increase intensity to meet deadlines.

 

Discussion Points:

  • It is unreasonable and unnecessary to expect delivery ‘indefinitely’. In contrast to a commercial setting, the duration of collaboration within a research team is often limited by grants, so specifying any particular duration is equally unnecessary. The key message of this principle is that the ways of working should be ‘sustainable’ for all members of the team while the collaboration lasts.
  • As the term ‘sustainable’ is often associated with environmental impact, which is not the point of this principle, we have added ‘pace’ to the original wording for clarity.

 

 

Principle 9

 

Original:

Continuous attention to technical excellence and good design enhances agility.

 

Wording for Research Projects – No Change:

Continuous attention to technical excellence and good design enhances agility.

 

Discussion Points:

  • There is no obvious challenge in translating this principle from a commercial to a research setting and there were no other suggestions raised in the discussion.
  • A frequent discussion in relation to this principle is how it sits alongside the earlier principles of embracing change and producing outputs early – but this concerns the adoption of agile ways of working themselves, rather than their adaptation to research/academia.

 

 

Principle 10

 

Original:

Simplicity – the art of maximizing the amount of work not done – is essential.

 

Wording for Research Projects – No Change:

Simplicity – the art of maximizing the amount of work not done – is essential.

 

Discussion Points:

  • As the wording ‘maximising the amount of work not done’ is purposely bold and provocative, it sparked a discussion as to whether this statement encourages not putting sufficient effort into the project. However, the majority agreed that the statement clearly encourages prioritisation and efficiency rather than avoiding work that is legitimate and important (whatever the nature of that work might be, including documentation, refactoring, the search for efficient solutions, etc.).

 

 

Principle 11

 

Original:

The best architectures, requirements, and designs emerge from self-organizing teams.

 

Wording for Research Projects – No Change:

The best architectures, requirements, and designs emerge from self-organizing teams.

 

Discussion Points:

  • The principle feels congruent with our understanding of the best ways of working for research projects and didn’t raise any challenge in the discussion.
  • Although not brought up in the discussion, one relevant point might be how to include the role of ARC Project Manager in the construct of the flat Agile team (especially for Scrum).

 

 

Principle 12

 

Original:

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behaviour accordingly.

 

Wording for Research Projects – No Change:

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behaviour accordingly.

 

Discussion Points:

  • There was general agreement and no challenge to this principle in the discussion. Everyone in the team is familiar with team or sprint retrospectives and broadly in agreement about their usefulness.
  • The challenge in this space might be in the details of the practice of retrospectives (or similar techniques) – their frequency, who runs the meeting, who attends it – to make sure that they bring the intended benefits, and in the ways the learnings are actively fed back into the working practices of the team.

Team (u)CLI are Go!

By Samantha Ahern, on 22 October 2024

ARC are excited to announce this year’s team for the CIUK Cluster Challenge – (u)CLI!

Team (u)CLI

  • Zak Morgan, Dept of Computer Science
  • Rozenn Raffaut, Dept of Med Phys & Biomedical Eng
  • Qi Li,  Dept of Med Phys & Biomedical Eng
  • Yuliang Huang, Dept of Med Phys & Biomedical Eng
  • Tom Bickley, Dept of Chemistry
  • Shahid Khalid, Neonatology

 

 

The CIUK Cluster Challenge is a national inter-collegiate competition. This year there are 16 student teams taking part from across the UK.

The competition consists of four challenges during October and November, before culminating in a further four challenges at Computing Insights UK (CIUK) in Manchester in December.

The competition is an opportunity for students to develop technical skills in high performance computing, network with computing professionals and develop a range of employability skills.

Last week, the team competed in the 1st challenge, set by Alces Flight, and finished 2nd. Well done team (u)CLI!

We wish them well in their future challenges and look forward to supporting them at CIUK in Manchester.

Could you be a future Cluster Challenge team member? Start your HPC journey by joining the Cluster Club.

 

 

Introducing: High Level Designs!

By Amanda Ho-Lyn, on 23 September 2024

What?

High level designs give an overall idea of how a project and its components are structured, typically with the aid of a diagram. If you’ve worked with AWS or the CDI, you’ve probably seen architecture diagrams outlining the flow of an app through various services – this is the sort of idea we’re talking about.

Why?

Following our previous survey on how knowledge is currently shared within the department, we found that many of us care about the high level designs of projects, but for most projects we don’t actually know what they are unless we’re working on them.

As such, we decided to add some guidance on how to add this to the ops tracker, as well as some examples of how you might want to structure your design info in the issue and the repo. We’re hoping that this is a light and simple change that people are happy to adopt, since it should help us understand projects better at a high level – hopefully reducing the number of times we jump onto a project only to be overwhelmed and confused by how things are set up, blindly feeling our way around in the hope of understanding.

That being said, the main burden will fall to project managers, who will be tasked with making sure the designs are [still] relevant each TI – and if they don’t exist, it might be time to add them.

This will also result in a small update to the issue template – an aspect of the ops tracker and a key component of how we share knowledge within ARC – which we’d like your feedback and thoughts on in this survey if you’ve not yet completed it.

 

ARC at UN’s OSPOs for Good and What’s next for OSS?

By David Pérez-Suárez, on 26 July 2024

OSPOs for Good – A symposium at the UN.

At the beginning of July I attended the OSPOs for Good symposium at the United Nations Headquarters, together with Malvika Sharan and Arielle Bennett from the Alan Turing Institute / The Turing Way and Neil Chue Hong from the Software Sustainability Institute, representing research institutions from the UK.

What’s an OSPO?

An Open Source Programme Office (OSPO) is an entity within an organisation that looks after its Open Source strategy and operation. OSPOs have existed in industry for many years (under many different names), but only in recent years have academic and research institutions, as well as governments and world organisations, also started to create them. Certainly, the goals of all these different OSPOs differ, but having a common name helps to develop a fabric across them that promotes collaboration.

OSPOs for good?

At the symposium, OSPOs from different entities – national and local government, institutions and the private sector – together with open source communities, think tanks and activists, joined together to discuss how Open Source, through those varied programme offices, can be used to help tackle the UN’s Sustainable Development Goals (SDGs). Ideas ranged from those that can run within specific communities, companies or institutions to collaborations across all of them.

For over two years now, the UN has been running an OSPO, and this event is part of its commitment to working with the whole world. The symposium gave details about how OSPOs were formed and their motivations, as well as advertising initiatives from those attending. For example, the OSPO at the UN started with a survey of all its sub-organisations to understand their positions on Open Source Software, whether from a user, creator, maintainer, community member or stakeholder point of view. It has grown fast and highlighted that Open Source is more than Open Source Software! Open Source covers open communities, knowledge, data, science, and more. On the digital front, these were referred to as Digital Public Goods (DPGs), which I think conveys more weight than open source software or open data. A person may not understand what open source software is, but a public good suggests something that is good for them!

In open source communities, however, many of us have a tendency to focus on the software and forget to look beyond it. How can we go an extra step and check whether what we are creating is relevant to the SDGs? That may look too hard (and even over-ambitious?); if that’s the case, we should at least commit to doing no harm – I first heard this in Malvika’s talk, Applying Do No Harm principle to enhance open source sustainability, at SOOCon24. That’s where many of us need to start: reviewing our practices and applying do-no-harm principles within our communities to work towards Open Source for good.

Though many people see the good that Open Source can do for the world, not everyone is an open source advocate! Even within the UN, they have challenges and myths against Open Source that are common in many other communities. Some of these are based on placing more trust in big tech companies, or on using privacy or security concerns as an excuse to prevent opening their source (whether that’s data, code or knowledge). Regarding security, however, a recent event that caused worldwide mayhem demonstrated that vulnerabilities are not exclusive to open source software.

From understanding how the UN’s OSPO works at a multi-organisation level, we moved on to the state/local government level. In Germany, they are working to replace much of the wheel-reinvention across different states with a more collaborative approach in which tools and processes are shared (through platforms like openCode). And it has not stopped there: they are collaborating with France too! There, the BlueHats initiative helps to promote and recognise the community behind Open Source development within government. Funding open source is a big problem, and that was discussed too – people from Sovereign TechFund and ZenDiS gave an overview of the successful model currently being used in Germany.

Initiatives within foundations and cities were also discussed, as well as the opportunities universities can have collaborating with their local communities on that.

One of the UN OSPO’s star activities, mentioned and praised, was Reboot the Earth: five hackathons held around the globe to come up with ideas that could advance the SDGs. The winners from each of them presented their ideas to us, and they are now taking part in a coaching programme to scale their solutions. This was very inspiring!

Once the symposium finished, I, together with many of the attendees, participated in a one-day workshop: What’s next for Open Source? There we had more time to interact and collaborate with each other and come up with various lists of actions to keep pushing forward the idea of Open Source for good. One thing, at least, was easily achieved: we made connections between various organisations that will turn into future collaborations.

What’s next for UCL?

Well, glad you asked! At ARC we are starting to build an OSPO to provide Open Source guidance and support to research, education and e-infrastructure projects across the whole university. At the moment we are scoping our vision and mid-term goals, but hopefully soon we can advertise our plans to the rest of the university and the whole world. Would you like to be involved? Write to us at open.source@ucl.ac.uk.

Where can I read more about the event?

You can check the event highlights, the event recordings and reports from other attendees such as Nicole Martinelli at OpenSource.net, Steven J. Vaughan-Nichols at The New Stack and at The Register, Nithya Ruff at the Linux Foundation, and Henriette Litta at the Open Knowledge Foundation.

RSE Initiatives – 6 months in

By Amanda Ho-Lyn, on 7 June 2024

What?

At ARC I think it would be fair to say we strive to develop and improve not only on an individual level, but also on a group level. One of the ways we are doing this is through our RSE (Research Software Engineer) Initiatives – aiming to advance and evolve the RSE team to improve collaboration and delivery of the best possible software. They involve taking a more objective look at the current processes within our department and determining, by consensus, whether some of these processes need to be updated, or whether a new solution should be devised. These are not overnight quick-fixes but rather slow and steady progress in the right direction.

We’ve focussed on 3 main areas: Professional Development, Good Practices and Knowledge Sharing.

As we’ve recently reached the 6 month mark of embarking on this journey, I thought I’d share an overview of each initiative’s aim and how we’re doing.


Professional Development

Notable people: Connor Aird, Stef Piatek & soon to be Paul Smith

This is about understanding how we currently decide to upskill ourselves (both soft and technical skills), what opportunities there are, and how we can enable and support more and better opportunities.

The way we decided to figure out what people are doing regarding their professional (and to some degree personal) development was by interviewing them.

At the time of writing, almost all the interviews have been completed and the data gathered is being prepared for analysis.

Good Practices

Notable people: Haroon Chughtai, Kimberly Meechan & Emily Dubrovska

This looks at how much we engage with establishing and following best practices with technologies, languages and tools. We also want to determine whether there are areas where we could formalise/document this for future RSEs – a notable example is within the Python Tooling Community.

We decided it would be worth modelling the approaches of the Python Tooling Community and seeing whether there are other language/technology communities within ARC that don’t have best practice guidance but would benefit from it. This was done through a survey.

At the time of writing, the next groups of interest are Web Development and DevOps – both in the stages of requirements gathering/gaining an idea of what guidance could be documented or be built on, as well as looking into how it could best be delivered. 

Knowledge Sharing

Notable people: George Svarovsky & Amanda Ho-Lyn

This is about understanding how we currently share knowledge across the group – particularly project information – and how we can improve our current systems to be more usable and make information more accessible.

We decided to do a survey to see how people felt about how information is currently shared, and also how much they actually felt they knew about different aspects. There were also some mentions of discontent about information being posted and shared across a plethora of platforms.

At the time of writing, we have added a mini landing page to the ARC GitHub (note that you must be part of the org to see it) in an attempt to centralise relevant links to various places – this is a living thing and can be updated as necessary. We have also sent out a survey (thank you to those who took the time to complete it) and have plans to act on the results – see my post with more details about this (coming soon).

 

Thanks to everyone who’s been a part of this and continues to help us improve – especially to Asif who is forging the way ahead. And keep an eye out for more surveys! 😁

 

Creating Custom Font Symbols

By Amanda Ho-Lyn, on 5 June 2024

Why?

There are various reasons why researchers may want to add custom symbols to a font; they may want to represent a new concept or add characters from historical sources that are not yet available in any font. Being able to do so makes their work more accessible to colleagues and the public. In the case of ancient languages, there often aren’t fonts that support these glyphs, as they are still being researched and aren’t widely used outside of their respective communities. There are also arguments to be made for using custom glyphs in art and design.

How?

Figure out your glyph codes

The Unicode standard facilitates this through the private use areas (PUAs), wherein we can assign codes to custom glyphs. This gives each glyph a universal ID, allowing it to be packaged and used more widely. This is especially important when dealing with more than one font that may make use of the PUAs, as we want to limit clashes between codes, and it allows the font to be used across different projects that have a use for the custom glyphs.

If you have an existing font and want to check if it uses some PUA codes, open the font in a charset checker and specify the Private Use Area block (you can also check for other things). If you’re free and clear, then you have no restrictions when it comes to picking the codes within the PUA block.
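
For instance (a minimal sketch with invented glyph names), the Basic Multilingual Plane’s PUA runs from U+E000 to U+F8FF, and a project can keep a small registry of which code it has assigned to which custom glyph:

```python
# Hypothetical registry of custom glyphs mapped into the BMP Private Use Area.
PUA_START, PUA_END = 0xE000, 0xF8FF

custom_glyphs = {
    "sun-disc": 0xE000,      # invented glyph names, purely for illustration
    "reed-bundle": 0xE001,
}

for name, codepoint in custom_glyphs.items():
    assert PUA_START <= codepoint <= PUA_END, f"{name} falls outside the PUA block"
    print(f"{name}: U+{codepoint:04X} -> {chr(codepoint)}")
```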

Draw your glyphs

You’ll need to design your glyphs with a specialist tool. Since we’re working with custom Unicode ranges in this case, that’s a specific bit of functionality to look for. I used glyphr studio, which is a free online tool, and it worked really well. It’s a good idea to know how many glyphs you’ll need to draw, as you have to specify a range, though it can be updated later. If you can work from references, that will make this process much simpler. You’ll also want to spend some time getting used to the interface if you’ve not used a font designer before. With glyphr studio you define the outline points and it automatically fills in the space between them. Do also remember to define a height (typically automatic) and width (not automatic) so that your glyphs will display properly.

glyphr studio interface

You can fine-tune the curvature, width, etc. of the points, and once you have the basis down you can always come back to tweak the glyphs until you’re satisfied. This is definitely the most time-consuming step, but it can be fun once you’re accustomed to the interface.

glyphr studio point close up

Export

Export your file as an .otf (OpenType Font) and fill out the relevant metadata as you see fit. Feel free to also keep a copy of the project file from glyphr studio but you can always edit the .otf file and it should have all the relevant data. This file can be shared and stored and can be included in projects as needed.

Including it in your project

To include your snazzy new .otf file in a web project, you can just add the file to the codebase, preferably under static/fonts. Then, you can import it into your main CSS file as follows:
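
The import is a standard @font-face rule; in the sketch below the font family name and file path are placeholders for whatever you chose:

```css
/* Register the custom font; the family name and path are examples only. */
@font-face {
  font-family: "MyCustomGlyphs";
  src: url("/static/fonts/my-custom-glyphs.otf") format("opentype");
}

/* Apply it wherever the custom symbols are needed. */
.custom-glyph {
  font-family: "MyCustomGlyphs", sans-serif;
}
```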


To utilise it, you can implement a symbol picker (particularly helpful if the symbols don’t have a letter equivalent), add it to a special character input in your text editor, or assign it as you would any other font.

Et voila!

Customising Rich Text Editors

By Amanda Ho-Lyn, on 5 June 2024

Why?

Customising rich text editors within web apps offers significant benefits, particularly for researchers in Digital Humanities. While basic text editors fulfil essential functions, rich text editors (RTEs) can incorporate features such as rich text formatting and customisable styling, allowing for clearer and more engaging presentation of textual information. Whilst they provide greater benefits than regular text editors, you can further enhance them through customisation, which can greatly improve the user experience and effectiveness of communication. 

RTEs can be particularly useful in situations where researchers are working with sources where meaning is encoded in the visual layout. They enable researchers to communicate broader and more intricate ideas effectively. Features like embedded multimedia content (photos, videos, links), interactive elements (tables, footnotes), and collaborative editing tools (history, comments) can facilitate the expression and exploration of complex concepts within the text itself. These can also include adding functionality for symbols or text decorators not widely used – helpful with mathematics and ancient languages. See my post on creating custom font symbols for more details.

Customisation can also foster more connections and cohesiveness within the website or application. Integrated features such as cross-referencing, hyperlinking, and version control enhance the interconnectedness of textual content, enabling users to navigate seamlessly between related information and fostering a more immersive and engaging user experience. 

TLDR: Basic text editors are fine for simple input; Rich Text Editors are better for content with more variety, and customising them builds on that functionality to enrich the quality of the research done.

 

How?

This somewhat depends on the structure of your project, but most likely it will be through a CDN, an npm install, or directly adding the package to your codebase (in the case of JavaScript). If you have a Python-based project, odds are you can install it through pip.

From experience…

I recommend CK Editor (4), as this has worked well in both of our implementations and is very stable and well-established, meaning there are many plugins already written that work with it. You can, of course, write your own for even more customised functionality! The 5th version is likely also a good option, but may require a different license. There are different degrees of customisability with CK Editor, from using their basic setups to enabling any and all plugins you can find, which makes it very adaptable to a variety of projects. You could even overhaul the entire thing if you really wanted. It uses a config.js file by default, but each initialisation can also be customised on an individual basis.
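
As an illustration for a Python-based (Django) project, here is a hedged sketch of what that per-project configuration might look like with the django-ckeditor package (the toolbar layout and plugin choice below are examples rather than the configuration used in our projects):

```python
# settings.py - a sketch of configuring CKEditor 4 via the django-ckeditor
# package; every value below is an example rather than a recommendation.
CKEDITOR_CONFIGS = {
    "default": {
        "toolbar": "Custom",
        "toolbar_Custom": [
            ["Bold", "Italic", "Underline", "Strike"],
            ["Link", "Unlink", "Image", "Table", "SpecialChar"],
            ["Source"],
        ],
        "extraPlugins": "mentions",  # enable an additional CKEditor 4 plugin
        "height": 300,
    },
}
```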

For example, here are two different builds of the editor:

The first is quite simple, though not the most basic, and has extra functionality that isn’t immediately obvious: @ mentions and image browsing. The second is more populated (more plugins enabled) and also has some entirely custom functions (the Vs and Us on the second line) for adding diacriticals through Unicode. It also has @ mentions, which have been implemented with more complexity due to a greater breadth of items referenced. These were each implemented in different languages (one through Python and one through JS) and, whilst the way of configuring the build was different in each project, the results are stable and reliable whilst remaining flexible.

We have also used QuillJS but this was less well established and when it updated to a new major version, the old version stopped working (not great for a research project that is almost entirely text content).

TLDR: There are tonnes of options out there and it’s not always the best idea to go for the one that is the newest and most-cutting edge, particularly if you need something reliable and easy to maintain. I think CK Editor 4 is a solid choice for the next few years since it’s stable and well established, while still being flexible and customisable to your needs.