By James P J Hetherington, on 13 February 2015
The UCL Research Software Development Team is building automated testing and deployment infrastructure for cutting-edge, highly parallel scientific software developed by the university’s research community.
For these state-of-the-art simulations, which run on local and national supercomputing infrastructure, the research community is increasingly adopting continuous integration and automated testing to ensure that the software correctly implements the intended mathematics and science.
We are therefore constructing a platform to allow continuous integration of scientific software at high core counts, across a variety of platforms, toolchains, and environments. We are seeking a skilled contract systems engineer and DevOps specialist to help us with this.
The appointed contractor will work with the Research Software Development Team and the Research Computing Platforms Team, who manage UCL’s supercomputers, to extend our Jenkins-based automated testing platform. The work involves making use of the power of our 7000-core cluster, supporting virtualisation and containerisation so that test payloads are reliably independent of one another, adding to the capability of the platform by configuring test nodes for a variety of platforms, and discovering, managing, and authenticating tests and users from across the college’s research portfolio.
Please contact rc-softdev (at) ucl.ac.uk if you are interested.
- Advanced Linux systems administration experience, especially RHEL, Scientific Linux
- Advanced automation with Puppet
- Containerisation and virtualisation with Vagrant and Docker
- Virtual machine management on Linux with KVM/QEMU
- CI configuration with Jenkins
- Version control with Git and GitHub
In each of the above cases, we would consider candidates with extensive experience in an alternative toolchain and basic familiarity with our preferred technology (e.g. we would prefer Puppet experts, but would consider candidates with Chef expertise and basic familiarity with Puppet).
- Configuration of cluster schedulers, especially Oracle Grid Engine
- Distributed file systems (especially Lustre)
- Broad familiarity with a variety of language and tool families used in research (e.g. Python, pip; Ruby, Gems; Perl, CPAN; Haskell, Cabal; Java, Maven; Fortran/C++, CMake)
- Integration of Windows systems with Linux automation environments (Puppet on Windows, Bitvise SSH, PowerShell, Chocolatey)
- OS X administration (Homebrew, Homebrew Cask)