Using continuous integration efficiently

By Matt Graham, on 3 November 2023

Use of continuous integration (CI) is an important part of creating robust and reproducible research software, however running automated jobs on services such as GitHub Actions and GitLab CI/CD comes with an associated energy, and so environmental, cost.

As an example, in the 30 days from 2nd October to 1st November 2023, the UCL/TLOmodel repository ran GitHub Actions workflow jobs which used 1937 hours of runner time. As a (very rough) back-of-the-envelope estimate, assuming an average runner power consumption of 12W^[1] this equates to a total monthly CI energy usage for this project of 23kWh, which is about 10% of a typical UK’s households monthly electricity usage or the monthly energy usage of around 8 UCL employee’s laptops^[2].

Below are some simple ways to reduce GitHub Actions and GitLab CI/CD runner usage without compromising the gains that automated testing and deployment brings. These approaches also have the side benefit for private GitHub repositories of reducing usage of the free Actions minutes quota.

Automatically cancel redundant jobs
For jobs triggered by pushing commits to a pull or merge request branch, we may push new commits and trigger new job runs while previous runs are still in progress. Typically we will only care about the job results on the latest commit, and so the previous job runs will be redundant. The concurrency.group and concurrency.cancel-in-progress properties in GitHub Actions workflows can be used to automatically cancel already in progress job runs at different grouping levels – for example for workflows triggered for the same pull request, branch or tag. In GitLab, jobs which have the interruptible property set to true will be automatically cancelled when the Auto-cancel redundant pipelines project setting is enabled. Both GitHub and GitLab also allow manually skipping job runs when pushing new commits by including a token such as [skip ci] or [ci skip] in the commit message.
Use filters and rules to only run jobs when needed
Commonly some checks are only relevant to a subset of the files in a repository. For example if a branch only involves changes to a Markdown README file we probably do not need to run a test suite for the source code in a repository. In GitHub Actions we can use the optional paths and paths-ignore properties of the push and pull-request triggers to only run workflow jobs when the files changed do or do not match one or more patterns. Similarly in GitLab we can use rules to set conditions on a when a job runs, with specifically the rules:changes property allowing limiting runs to when files matching specified patterns are changed. It may also make sense to have jobs only run when pull requests are no longer drafts and are marked as ready to review. This can be achieved in GitHub Actions using a combination of the on.pull_requests.types property and using an if condition on the job that checks github.event.pull_request.draft is false.
Set the scheduled job frequency appropriately
Both GitHub Actions and GitLab CI/CD pipelines allow running jobs on a schedule using crontab syntax. While we sometimes refer to nightly jobs, it is worth considering carefully what the appropriate frequency is for such scheduled jobs based on the use case and wider context of the project, and whether running jobs at for example a weekly frequency might be sufficient. For example running daily jobs to check whether tests for a package pass against the latest compatible versions of upstream dependencies may make sense for a a package with a large userbase and development team where breaking changes in upstream packages are both likely to be encountered in practice quickly, and where there is likely to be sufficient developer capacity to resolve the issues in a similar timeframe. On the other hand for a package with fewer users and contributors, having a test job identify such breaking changes at a weekly frequency may be sufficient.
Cache job dependencies
A common step in CI jobs is setting up dependencies on the runner, often using a language specific package manager like npm or pip. Rather than redownloading and building dependencies afresh on each run, we can use caching features to reuse the dependencies built in previous jobs (providing the new run uses the same versions of dependencies). The GitHub cache action provides a generic approach for performing caching in GitHub Actions jobs, while language specific setup actions such as setup-python have built in support for setting up dependency caching. GitLab similarly provides support for caching dependencies.
Set timeouts to avoid misbehaving jobs running for long times
The default timeout for GitHub Actions jobs is 6 hours. If you expect a job to run in much less time than that, setting a more conservative timeout can avoid misbehaving jobs which hang or run very slowly, from inadvertently using a large amount of runner time. The default timeout on GitLab is shorter at 60 minutes but can similarly be adjusted.
Chain jobs intelligently and fail fast
Commonly we will run an assortment of test and checks as part of CI workflows with a corresponding wide range of run times. Code formatting and linting checks will typically be the quickest to run, while test suites may contain both quicker unit tests as well as integration and system tests which can take longer to run. Ensuring faster running checks and tests run first, and halting further jobs from running if these fail, will avoid wasting compute time running slow tests unnecessarily before changes to fix the faster running checks are made. Taking this one step further, for very quick checks such as linting and formatting, frameworks like pre-commit can be used to ensure checks are run automatically when committing changes, reducing the chance of CI jobs failing and needing to be rerun in the first place. In GitHub Actions the jobs.<job_id>.needs property can be used to specify that other jobs must successfully complete before another job runs. When running a matrix strategy job which creates one job run for each of a set of configurations (corresponding to for example different versions, operating systems or groups of tests) the jobs.<job_id>.strategy.fail-fast property can be set to true to cancel any other still in progress job runs in the matrix if a single run fails. GitLab pipeline jobs also have a needs property for specifying dependencies.

Estimating the power usage of a CI job runner is complicated as typically jobs will be run on virtual machines (VMs) on cloud-hosted servers, with generally there being multiple VMs running in parallel on each server and potentially multiple job runners per VM. The analysis in Aldossary and Djemame (2019) suggests a VM with 4 virtual central processing units (CPUs) running on a host with eight-core Intel Xeon E3-1230 V2 CPU and 16GB of DDR3 RAM when running with an active load with 80% CPU utilization can be attributed a mean power consumption of around 40W (see Figure 7). If we assume four job runners per VM (one per virtual CPU) this corresponds to around 10W per runner. To account for the additional power overhead for data centre infrastructure such as cooling, we assume a power usage effectiveness of 1.2 (based on figures provided by Microsoft for Azure), giving an average overall power draw of 12W per runner. ↩︎
This is based on an assumption of an average power draw of 20W (under a mix of idling and working at load) for a Dell Latitude 5410 laptop with a 36.5 hour working week and four weeks in a month giving an estimate of 2.92kWh used per laptop per month. ↩︎

Filed under GitHub Actions, GitLab CI/CD

Comments are closed.

Using continuous integration efficiently

Recent Posts

Archives

Categories

Meta