Centre for Advanced Research Computing

ARC is UCL's research, innovation and service centre for the tools, practices and systems that enable computational science and digital scholarship

Archive for the 'Uncategorized' Category

The ARC TRE

By George Svarovsky, on 28 November 2024

The Advanced Research Computing Trusted Research Environment (ARC TRE) is available for UCL researchers. It can be found at https://tre.arc.ucl.ac.uk/.

What is the ARC TRE?

The ARC TRE provides compute environments, with desktops for each researcher, for projects working on sensitive data. You can think of each project space as a secured virtual room with computers and a network – and strict rules about who can bring data in and take data out. It’s a safe setting, where safe projects with safe people can work on safe data and produce safe outputs.

We’re now onboarding projects for which the environment is suitable, via the Information Governance assurance process. Suitability of the environment will expand through 2025, with certification for NHS data early in the year, and ISO27001 certification in mid-year; so please do consider the environment for projects in planning.

How is it different?

In designing a next-generation environment for research computing on sensitive data, ARC had some ambitious principles in mind.

One size does not fit all. The environment allows projects to choose the compute resources they need, and also the tools, and even their internal governance and fine-grained data permissions. Changes in one project do not affect any other project.

For science to be reproducible, its software setup must be too. The TRE provides projects with container technology (Docker/Podman and Snap), a key enabler for making software setup repeatable. Not everyone will be familiar with this technology, so with this design decision ARC is doubling down on its mission: to support researchers in making best use of the tools, practices and systems that enable computational science and digital scholarship. Our Collaborations & Consultancy team are ready to help.

Unbounded research needs unbounded compute. ARC is investing in research compute, and these platforms should be just as available for use with sensitive data as for any other research. The TRE uses cloud practices to provision the required security, initially on a public cloud platform, AWS – giving virtually infinite capacity – and, in future, onto ARC’s research compute and storage platforms – giving access to high-performance compute at much lower cost.

Who can use it?

The ARC TRE is running under a new Information Security Management System (ISMS), which is in the process of being operationalised at ARC. This means the environment is not yet ready to take NHS data, or data that explicitly requires ISO27001 certification, but both certifications are planned for the first half of 2025. So, if your data does not require these certifications, or you are planning research for later in 2025, you can consider using the ARC TRE. In any case, the Information Governance assurance team or the TRE team will be happy to advise on the suitability of the environment for your research.

In terms of costs, the ARC TRE has a straightforward and permissive pricing structure:

  • For basic usage (a moderately powerful t3a.medium desktop per user, and 100GB of project storage), the ARC TRE is free.
  • If you require more than this, we will ask you to include some “facility” costs in your funding. We use a transparent model based on AWS pricing. As an example, requesting a p3.2xlarge (with a V100 GPU with 16GB VRAM) will cost about £140 a month.
  • We will not rescind access or charge you, even if you end up using more compute than originally estimated, or if the AWS price fluctuates. (In the case of accidental or unreasonable overuse, we’ll have a polite conversation!)
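
As a rough illustration of the pricing rules above, the sketch below checks whether a request stays within the free basic tier and estimates a monthly facility cost. Only the t3a.medium instance type and the 100GB storage limit come from this post. The names `ProjectRequest`, `is_basic_usage` and `estimate_monthly_cost` are hypothetical, and the "hourly rate times hours used" formula is an assumption rather than ARC's actual cost model.

```python
from dataclasses import dataclass

# Free-tier limits as stated in the post; everything else here is hypothetical.
FREE_INSTANCE_TYPE = "t3a.medium"
FREE_STORAGE_GB = 100


@dataclass
class ProjectRequest:
    instance_type: str  # desktop instance type requested per user
    storage_gb: int     # project storage requested, in GB


def is_basic_usage(request: ProjectRequest) -> bool:
    """Return True when the request stays within the free basic tier."""
    return (
        request.instance_type == FREE_INSTANCE_TYPE
        and request.storage_gb <= FREE_STORAGE_GB
    )


def estimate_monthly_cost(hourly_rate_gbp: float, hours_per_month: float) -> float:
    """Assumed model: hourly rate multiplied by expected hours of use."""
    return round(hourly_rate_gbp * hours_per_month, 2)
```

A request for a t3a.medium desktop with 100GB of storage would count as basic usage here. Anything larger would need a rate and an expected usage figure agreed with the TRE team before an estimate means anything.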

Early adopters of the ARC TRE might find they are talking to the team, and influencing the development of the environment, more than they expect! The project is run using Agile and DevOps, and we will continuously improve the platform (always taking great care to ensure projects in flight are not adversely affected).

What’s in it for ARC?

ARC is UCL’s research, innovation and service centre for the tools, practices and systems that enable computational science and digital scholarship. We expect concrete learning from and for each TRE project: benefiting both ARC and the research teams.

As ARC collaborates on different projects with their own configurations and procedures to solve their own unique challenges, we will be building up an armoury of configurations and procedures that can be used again. A configuration that worked well with one dataset will be invaluable for another project with a similar dataset. Data owners will find that we can meet their needs repeatedly and repeatably.

As we expect to manage this service through many hundreds of projects, we will be able to carry the knowledge gained from one project into future projects, and to work out how to gather and organise this information to continuously improve our information security and user experience.

Project Example: Machine-Learning Assisted Diagnosis

We are engaged with a research and innovation group with an ambition to provide a mobile phone-based solution for assisting the diagnosis of jaundice in babies, a prime indicator of liver disease later in life. This project will be supported by the TRE through its innovation journey from research to spin-out:

  1. Consented patient photographs are captured and ingressed into the TRE, and medical records ingressed from the hospital.
  2. Machine learning (ML) models can be developed in a TRE Project using the babies’ sensitive data.
  3. The ML models can continue to be refined in the TRE while and after the production system is built on the public AWS cloud, with new model versions being egressed and deployed.
  4. Learnings and components from the TRE Project can be shared and reused, including best practices, and hospital data pipeline integration.

What about the Data Safe Haven?

The current Data Safe Haven was designed over ten years ago. Its environment meets many research requirements, but it was not designed to support more modern data handling methodologies. The ARC TRE will provide a modern computing environment with the additional flexibility and power to support those UCL researchers for whom the Data Safe Haven is limiting, providing the same level of Information Governance assurance. Development is under way and is being carried out in an iterative fashion, with new features being rolled out over time.

UCL ARC and ISD are committed to supporting the Data Safe Haven until the ARC TRE is ready to take on all existing researchers. As new features are available in the TRE, research studies will be migrated in a controlled and supported manner. At this stage it is hard to put a timescale on this. Any changes will be fully supported with training and assistance.

What’s next?

As mentioned, the team is set up to continuously improve the ARC TRE. We have several capabilities that we know will be needed, and we will work with researchers to refine these requirements and to discover new ones.

Machine ingress. We’re already working on a secure API so that data can be fed into Projects without human interaction.

More shared resources. ARC and researchers can make contributions to the resources available to all ARC TRE projects, including common software tools and reference data. We expect to greatly expand the available resources over the coming months.

Inter-project sharing. We intend to allow projects with the requisite information governance to exchange data without it having to leave the environment.

Project templates. We strongly encourage projects in the ARC TRE to use software version control for their project materials. This is an essential tool for collaboration; and provides a history to aid transparency and recall of decisions. ARC will demonstrate how to re-use existing project setup repositories, which can become templates for multiple projects.

Archiving. The ARC TRE is not intended for long-term storage of research data; instead, it’s a compute environment for active research. However, many projects require archival of sensitive data. For now, researchers can leave their projects on the TRE, where the data is securely backed-up. We’re working with the ARC Data Stewards to ensure that project archival requirements can be met optimally in future.

Windows desktops. The ARC TRE provides Red Hat Enterprise Linux desktops. Our initial user feedback has been (slightly surprisingly) positive about this decision – but we know that many projects in the Data Safe Haven make use of Windows tools. A managed Windows environment for sensitive data will continue to exist for as long as it’s needed – design of the long-term solution is ongoing.

Thanks

A great many people have been involved in this project, at ARC, ISD, ISG and beyond, and we are grateful to one and all.

The core technical team of Tom Young (lead), Robert Vickerstaff, and Sunny Park; enabled by delivery manager Rupert Roker. Architecture and security management system led by Tim Machin, ably supported by Finley Bacon who also took on the service management; and the ARC information governance team Martin Donnelly, Preeti Matharu and Victor Olago. Ongoing and valued expertise from Jack Hindley, Dan Bretherton, Trevor Peacock, Anthony Peacock (no relation), David Solomon, Werner Niekirk, Ben Thomas, and others who have been pestered about one thing or another since January. User representation from Nick Owen, Socrates Varakliotis, John Watts, Michelle Harricharan, and many more. And the leadership that sponsored and fought for the project: Donna Swan, Rachel Cooper, and James Hetherington.

Want to know more?

The environment is visible to anyone with a UCL login, at https://tre.arc.ucl.ac.uk/

Information on Sensitive Data and Trusted Research Environments

Homepage of the Trusted Research Environment Assurance process

George Svarovsky is a Principal Research Software Engineer working in UCL’s Centre for Advanced Research Computing. George is the Product Owner for the Trusted Research Environment, working with the service’s users to understand their needs and ensure that the service his team builds meets those needs. If you would like to get in touch with George to talk about how the TRE can best support your research then please contact the team.

Introducing: High Level Designs!

By Amanda Ho-Lyn, on 23 September 2024

What?

High level designs give an overall idea of how a project and its components are structured, typically with the aid of a diagram. If you’ve worked with AWS or the CDI, you’ve probably seen some architecture diagrams outlining the flow of the app with various services – this is the sort of idea we’re talking about.

Why?

Following our previous survey on how knowledge is currently shared within the department, we found that many of us care about the high level designs of projects, but for most projects we don’t actually know what they are unless we’re working on them.

As such, we decided to add some guidance around how to add this to the ops tracker as well as some examples of how you might want to structure your design info in the issue and the repo. We’re hoping that this is a light and simple change that people are happy to adopt since it should help us understand projects better at a high level – hopefully reducing the number of times we jump onto a project only to be overwhelmed and confused by how things are set up and trying to blindly feel our way around in the hopes of understanding.

That being said, the main burden will fall to project managers, who will be tasked with making sure the designs are [still] relevant each TI and, if they don’t exist, with considering whether it is time to add them.

This will also result in a small update to the issue template – an aspect of the ops tracker and a key component of how we share knowledge within ARC – which we’d like your feedback and thoughts on in this survey if you’ve not yet completed it.

 

Recruiting Three Research Software Developers and a Senior Research Software Developer

By James P J Hetherington, on 25 January 2017

The UCL Research Software Development Group, founded in 2012, was the first of its kind, and is one of the leading university-based research programming groups in the UK.

We work across college developing high-quality software in collaboration with scientists, engineers and scholars from all research domains.

Whether this means using Python to build up a database of ancient Sumerian writings, parallelising Fortran codes for surface catalysis simulations, adding red blood cells to a supercomputing simulation of brain blood-flow, refactoring DNA forensics code in R, or designing and building a big data image processing library in C++11 and Python, we do it all, bringing specialist programming expertise, modern development practices (CI, TDD, Agile…), and engineering rigour to academic software. We provide expert software engineering consulting services to world-leading research teams, and work with scientists and scholars to build software to meet new research challenges.

If the following describes you, then you should consider working with us:

  • You have created and maintained software to address advanced research problems in one or more fields
  • You can rapidly assimilate understanding of new scientific questions, and quickly connect research needs to software requirements
  • You are committed to software development best practices, and know how to adapt these to research contexts
  • You are expert in one or more languages and platforms used for scientific computing, and are keen to expand your knowledge.

Three research software developer posts are available. For more details check this advert. These are permanent but subject to the availability of continued funding sourced from appropriate research collaborations, and are funded for two years in the first instance.

In addition, a senior research software developer post is available. The details are in this advert. This is a permanent position.

In addition, we will shortly be advertising a second senior position, to lead our new Data Science team, working closely with the UCL Centre for Data Science.

UCL welcomes EPSRC Research Software Engineering Fellows

By James P J Hetherington, on 15 May 2015

EPSRC has announced a pilot call for Research Software Engineer Fellowships.

This is an exciting step towards the development of a long-term career path for research programmers, something we’ve been struggling towards for a long time.

UCL will be very keen to support the best candidates in applications to this fellowship.

We’re committed to a dual academic-and-support model for the RSE role, so we welcome great candidates who seek a joint position in both the Research Software Development Group and a UCL academic department. Interested candidates should get in touch with me at j.hetherington@ucl.ac.uk. I can help you make contact with departmental colleagues.

UCL is one of the best places on the planet to be a programmer-researcher.  We think future leaders of the research programming community will benefit from basing their fellowship in our growing, boundary-pushing research software development group.

UCL Affiliates with Software Carpentry Foundation

By James P J Hetherington, on 22 January 2015

Software Carpentry is a volunteer organisation whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills. Founded in 1998, it runs short, intensive workshops that cover program design, version control, testing, and task automation. In October 2014, a non-profit Software Carpentry Foundation was created to act as a governing body for the project.

UCL is London’s Global University, founded to open up education to all on equal terms. Today our outstanding research and innovative teaching drive entrepreneurial solutions to the world’s major problems.

UCL Research IT Services is UCL’s organisation supporting and enabling computational research. Committed to sharing the most effective ways of using computers for research, we offer practical, hands-on training in high performance cluster computing and research software development. We are excited to be part of the software carpentry community, and organise software carpentry events for UCL students and research staff three times per year.

By affiliating with the Software Carpentry Foundation, organisations demonstrate their commitment to teaching best practices for scientific computing, contribute charitably to support a sustainable future for the organisation, and help shape the future of digital research.

UCL is proud to become an affiliate of the Software Carpentry Foundation

(See original news posting)

UCL Research Software Dashboard Developer

By James P J Hetherington, on 9 December 2014

The University College London Research Software Development Initiative is seeking a full-stack web developer to work on its Research Software Dashboard project from January 2015 to July 2015.

This is a new project, starting from scratch, to develop software to curate, promote, and manage the University’s wide portfolio of cutting-edge scientific and scholarly software.

The project will provide an overview of the research software output of the college for scientists, managers, funders, investors and clients, including both open-source software and software being commercialised through the university’s business and consulting arms. It will integrate with the University’s code management infrastructure, based on GitHub Enterprise, software testing infrastructure based on Jenkins, and commercial software sales platform e-Lucid.

Software is an increasingly important scholarly output for research alongside publications, and this project will help retain UCL’s leadership in this important aspect of twenty-first century research.

Person Specification

Essential:

  • Full-stack web development expertise
    • Design and layout skills (Sass preferred)
    • Client-side JavaScript (jQuery preferred)
    • Server-side programming skills with an open source dynamic framework
      • Python with Django or Pyramid preferred
  • API design and integration skills
    • Experience with RESTful API design and consumption
  • Commitment to software development best practice
    • Test driven design
    • Continuous integration

Desirable:

  • DevOps and systems skills
    • Automated deployment (preferably Puppet)
    • Web server management and configuration (Apache or Nginx)
    • Virtualisation and containerisation (Vagrant)
  • Experience of research (e.g. MRes, MPhil, PhD…)
  • Experience of semantic web technologies
  • Agile software process experience (e.g. Scrum or XP…)
  • Experience with the GitHub API
  • Experience with Jenkins continuous integration framework
  • High-performance computing experience (Schedulers, MPI)
  • Understanding of UK Research ecosystem (REF, RCUK funding models, research metrics)
  • Experience with technology transfer and research commercialisation

Those interested in being involved in this important project on a freelance or contractor basis should get in touch with James Hetherington (j.hetherington@ucl.ac.uk) for more information.

UCL Research IT helps to win big-data research grant

By James P J Hetherington, on 1 December 2014

Compressive sensing is a recent breakthrough in information theory that has the potential to revolutionise the acquisition and analysis of data in many fields. UCL Research IT Services (RITS) are an integral part of a team that recently secured grants from the UK research councils to develop compressive sensing techniques to address the challenge of extracting meaningful information from big data.

The techniques developed will find application in a broad range of academic fields and industries, from astronomy to medicine.  They will allow high-fidelity astronomical images to be recovered from the overwhelming volumes of raw data that will be acquired by next-generation radio telescopes like the Square Kilometre Array (SKA).  The new techniques will also be of direct use in neuro-imaging to accelerate the acquisition time of diffusion magnetic resonance imaging (MRI), potentially rendering its clinical use possible.

Dr Hetherington of RITS’s Research Software Development Team said: “Software which embodies these research results will be readable, well-engineered and efficient to run on the world’s fastest computers, and will be an important output of this research, alongside research publications.” Funding from the Engineering and Physical Sciences Research Council’s Software For the Future Programme, which emphasises the importance of sustainable research software, will be used to fund members of the RITS team working to this aim.

Dr McEwen of UCL’s Mullard Space Science Laboratory (MSSL) and leader of the project said: “UCL’s Research Software Development Team, led by Dr Hetherington, were an invaluable partner in securing funding for this project and will be instrumental in its success. Their unique expertise in combining both scientific computing and professional software development will ensure that the software produced will be able to fully exploit high-performance computing architectures, while also being readily usable by the community.”

For more details see: http://www.ucl.ac.uk/mathematical-physical-sciences/maps-news-publication/maps1431

Research Computing Seminar

By Susanne Claus, on 21 October 2014

The next research computing seminar will take place on Friday, 24th of October, 1pm-2pm.
Professor Daniel Kroening, Department of Computer Science, University of Oxford, will present a seminar entitled “Reasoning about IEEE Floating-point Arithmetic with Abstract Conflict-Driven Learning”.

Date: Friday, 24th of October 2014
Place: Wilkins Haldane Room
Time: 1:00pm – 2:00pm

Abstract:

Conflict-Driven Clause Learning is the technique that enables the high performance of propositional satisfiability solvers. I will present a generalisation of Conflict-Driven Clause Learning that permits reasoning within an abstraction. I will show an exemplar, the domain of intervals over floating-point values, and its application to difficult problems in program analysis.

Speaker Bio:

Professor Daniel Kroening’s research focus is formal methods for the correct construction of hardware and software systems, with a focus on automated methods for checking compliance of an implementation with a specification. Techniques include model checking and automated testing.

He is particularly interested in applying these methods to practical hard- and software implementations given in languages like C or C++, or HDLs such as Verilog and SystemC.

Recruiting a Research Software Developer

By James P J Hetherington, on 5 August 2014

The UCL Research Software Development Team works with researchers across college to ensure UCL retains the highest standards of excellence in computational research. We are recruiting one additional research software developer for the team.

In this role, you will design, extend, refactor, and maintain scientific software across all subject areas. You will modify legacy software to run on state-of-the-art high performance computing infrastructure, provide expert software engineering consulting services to world-leading research teams, and work with researchers to build software to meet new research challenges.

You have created and maintained software to address advanced research problems in one or more fields. You can rapidly assimilate understanding of new scientific questions, and quickly connect research needs to software requirements. You are committed to software development best practices, and know how to adapt these to research contexts. You are expert in one or more languages and platforms used for scientific computing, and are keen to expand your knowledge.

For full details, see the advert.

Recruiting a Research Software Developer

By James P J Hetherington, on 3 January 2014

The UCL Research Software Development Team works with researchers across college to ensure UCL retains the highest standards of excellence in computational research. We are recruiting one additional research software developer for the team, appointed for two years in the first instance.

In this role, you will design, extend, refactor, and maintain scientific software across all subject areas. You will modify legacy software to run on state-of-the-art high performance computing infrastructure, provide expert software engineering consulting services to world-leading research teams, and work with researchers to build software to meet new research challenges.

You have created and maintained software to address advanced research problems in one or more fields. You can rapidly assimilate understanding of new scientific questions, and quickly connect research needs to software requirements. You are committed to software development best practices, and know how to adapt these to research contexts. You are expert in one or more languages and platforms used for scientific computing, and are keen to expand your knowledge.

For full details, see the advert.