UKRI Centre for Doctoral Training in Foundational AI
Archive for the 'Research paper' Category

ICML 2025: Will Bankes & Masha Naslidnyk

By Claire Hudson, on 25 September 2025

This summer, the CDT supported two of its PhD students, Will Bankes and Masha Naslidnyk, in attending the International Conference on Machine Learning (ICML) 2025 in Vancouver. Set against the city’s beautiful harbourfront and mountain backdrop, ICML gathered thousands of researchers from around the world for a week of talks, posters, and workshops.

Will’s Story: Alignment at ICML
In July I traveled to Vancouver for this year’s International Conference on Machine Learning (ICML). The event was enormous: over 4,000 papers were accepted to the main conference alone, not counting the two full days of workshops and smaller gatherings that followed. The venue, a spacious convention centre on the downtown harbourfront near Stanley Park, offered a stunning backdrop of seaplanes, mountains, forests, and wildlife; it was an inspiring place to meet, share ideas, and hear from fellow students, academics, and industry professionals.
I attended to present research from my PhD on a topic called alignment, a concept drawing significant attention in AI. Alignment is about ensuring a chatbot or large language model (LLM) behaves as intended. That’s no small feat: these models learn to generate text from vast internet data, which inevitably includes plenty of problematic material. Alignment is typically applied after training to steer the model’s behavior in a desired direction.
State-of-the-art alignment techniques work by comparing two possible responses to a prompt and indicating which one is better. For example, if you ask, “Which is better, pasta or sushi?” and the preferred answer is “sushi,” the model is nudged toward that choice. But what happens when preferences change? Maybe I had a bad sushi experience and now prefer pasta; how would the model know?
Our work tackles this question with an algorithm we call NS-DPO (Non-Stationary Direct Preference Optimization), which re-weights training comparisons based on how old the underlying preferences are. We show through experiments and theory that this approach helps models adapt when opinions shift over time. For the interested reader, a technical and lay summary is available here: https://icml.cc/virtual/2025/poster/44703
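To give a flavour of the re-weighting idea, here is a toy sketch of time-weighted preference learning. This is not the exact NS-DPO objective from the paper: the function names, the exponential decay schedule, and the toy numbers are all illustrative assumptions.

```python
import math

def dpo_pair_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Standard DPO loss for one preference pair: -log sigmoid(beta * margin),
    # where the margin compares policy vs. reference log-probabilities.
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

def ns_dpo_loss(pairs, now, gamma=0.9, beta=0.1):
    # Time-weighted DPO: older preference pairs get exponentially smaller
    # weight, so training tracks the most recent opinions.
    total, weight_sum = 0.0, 0.0
    for lc, lr, rc, rr, t in pairs:
        w = gamma ** (now - t)  # down-weight stale preferences
        total += w * dpo_pair_loss(lc, lr, rc, rr, beta)
        weight_sum += w
    return total / weight_sum

# Two pairs expressing opposite preferences; the recent one dominates the loss.
recent = (math.log(0.7), math.log(0.3), math.log(0.5), math.log(0.5), 10)
stale = (math.log(0.3), math.log(0.7), math.log(0.5), math.log(0.5), 0)
loss = ns_dpo_loss([recent, stale], now=10)
```

With `gamma` close to 1 all comparisons count equally, recovering something like ordinary DPO; smaller `gamma` makes the model forget old preferences faster.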
Attending a conference like ICML is invaluable as my PhD progresses, offering opportunities to connect with researchers and industry leaders across North America, Asia, and Europe. It was inspiring to exchange ideas on what is arguably one of the most transformative technologies of our era. Despite headlines about corporate power struggles, ethical concerns, and legal battles, meeting researchers who prioritize safety and global perspectives gives me hope that these critical conversations are moving in a thoughtful, inclusive direction. The experience reinforced my belief that collaborative, globally minded research is key to building AI we can all trust.
 
Masha’s Story: Kernels Everywhere
I attended ICML 2025 in Vancouver this July. One of my main interests is kernel methods, and while there wasn’t a session devoted exclusively to kernels, I found a strong theme of kernels as diagnostic and evaluation tools. What follows are my impressions of several papers that stood out.
One highlight for me was Learning Input Encodings for Kernel-Optimal Implicit Neural Representations. The work brought kernels into the theory of implicit neural representations (INRs). By analysing the infinite-width limit, the authors connected INR generalisation to kernel alignment and proposed a Kernel Alignment Regularizer (KAR). They went further with PEAK (Plug-in Encoding for Aligned Kernels), which learns input encodings tuned to the kernel perspective.
Another strand came from evaluation rather than modeling. How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence introduced the Kernel Divergence Score (KDS). They compare kernel similarity matrices of embeddings before and after fine-tuning to detect contamination. Because fine-tuning distorts unseen data differently than memorised data, the divergence reflects contamination levels.
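As a rough illustration of that before/after comparison, here is a toy version in numpy. The RBF kernel and the normalised Frobenius distance are my stand-ins, not necessarily the exact score the paper uses.

```python
import numpy as np

def rbf_kernel_matrix(X, lengthscale=1.0):
    # Pairwise RBF (Gaussian) kernel similarity matrix for embeddings X of shape (n, d).
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-d2 / (2 * lengthscale**2))

def kernel_divergence(emb_before, emb_after):
    # Divergence between kernel similarity matrices of the same examples,
    # embedded before and after fine-tuning. The intuition: fine-tuning
    # perturbs the kernel structure of unseen (clean) data more than that of
    # memorised data, so the divergence tracks contamination. The normalised
    # Frobenius distance here is an illustrative stand-in for the paper's score.
    Kb = rbf_kernel_matrix(emb_before)
    Ka = rbf_kernel_matrix(emb_after)
    return np.linalg.norm(Kb - Ka) / np.linalg.norm(Kb)

rng = np.random.default_rng(0)
before = rng.normal(size=(50, 16))
after_memorised = before + 0.01 * rng.normal(size=(50, 16))  # barely moved
after_unseen = before + 1.0 * rng.normal(size=(50, 16))      # strongly distorted
```

In this toy setup the lightly perturbed embeddings yield a much smaller divergence than the strongly distorted ones, mirroring the memorised-vs-unseen contrast.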
On the theoretical side, From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning aimed to reconcile kernel rescaling and kernel adaptation in different scaling regimes. Using statistical field theory, the authors showed that kernel rescaling captures part of the story, but not all—the directional changes matter, especially in nonlinear networks. The framework captures how directional feature learning shapes the covariance of network outputs, something standard rescaling methods overlook.
Equivariance was another recurring topic. Integration-free Kernels for Equivariant Gaussian Process Modelling tackled the problem that equivariant kernels usually require group integrations, which are expensive. Their construction based on fundamental regions of group actions gave integration-free kernels for vector-valued GPs. The empirical applications to velocity fields and dipole moments made a case that this is a practical tool.
Relatedly, Equivariant Neural Tangent Kernels extended NTK analysis to equivariant architectures. They derived explicit NTKs for group convolutional networks, showing that their dynamics match those of data-augmented non-equivariant networks, at least in expectation. The demonstration with plane roto-translations and SO(3) rotations showed that equivariant NTKs outperform plain ones in image and quantum prediction tasks.
Finally, the kernel and optimal transport crossover appeared in my own work on kernel quantile embeddings, and in Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances. The max-sliced Wasserstein distance has been used as a scalable alternative to high-dimensional OT by projecting into 1D, and this work gave it sharper theoretical footing. The authors provided finite-sample guarantees, clarified complexity (including NP-hardness for KMS-2-Wasserstein), and proposed a semidefinite relaxation. This was a good example of kernels being used in conjunction with OT.
Overall, kernels at ICML 2025 weren’t confined to one track: they showed up in architectures, in theory, and in diagnostics. The unifying theme I took away was that kernels continue to function both as analytical tools and as building blocks, bridging between modern machine learning and more classical statistical ideas.

Together, Will and Masha’s experiences at ICML highlight the diversity of research happening within the CDT, from ensuring AI systems align with human values to advancing the mathematics underpinning machine learning. Both students came away with fresh insights, new collaborations, and a sense of excitement about contributing to one of the most transformative fields of our time.

2nd Bayes-Duality Workshop: Daniel Augusto de Souza

By Claire Hudson, on 15 December 2024

From June 12th to 21st, 2024, I had the pleasure of attending and presenting a poster on my work at the 2nd Bayes-Duality Workshop, organized by the Bayes Duality team, a Japanese-French joint research project. The workshop was hosted at the Centre for Advanced Intelligence Project (AIP) of RIKEN in Nihonbashi, Chūō City, Tokyo.

Nihonbashi is one of the oldest districts of Tokyo, a lively business district where finance and office workers gather, neighbouring the Imperial Palace, where the Japanese monarch and his family live. Although I felt somewhat out of place in this non-academic environment, the two-week workshop offered plenty to engage with: invited talks, panels between speakers, a showcase of work by the Bayes Duality team, and a poster session.

As stated in the program, the workshop focused on the development of AI that learns adaptively, robustly, and continuously, like humans. A common theme in the presentations by collaborators of the Bayes Duality project was exploring the mathematical connections between the training data examples and the model parameters of these machine learning systems. This connection is highly desirable because of a difference in complexity: current state-of-the-art models have a vast number of uninterpretable parameters, while the data examples can usually still be understood by human experts.

Due to the length of the workshop, the invited talks could cover an extensive range of topics. Such breadth is hard to describe in a single post, and, remarkably, none of the talks felt out of place. They ranged from expected topics, such as the tutorial on the Bayesian learning rule, one of the papers that drew together the connections between data-parameter duality and convex duality, to more general topics in uncertainty quantification, such as Eugene Ndiaye’s tutorial and presentation on conformal prediction, as well as continual learning and the identifiability of parameters in neural network models.

The poster session included works mentioned in the invited talks and others from students like me. I chose to present my progress on “Interpretable deep Gaussian processes for geospatial tasks”; in this project I analyse the interpretability of three commonly used deep Gaussian process architectures, try to understand what practitioners really mean by “interpretable”, and suggest a different metric from the one commonly used. I felt this was the right work to present to this workshop’s audience, given their familiarity with Bayesian deep learning and their interest in understanding the parameters of these models. As the only student from UCL, I was happy to display our work and connect with researchers from institutions all over the world, with attendees from the US, Asia, and Europe.

Student presentation – Alex Hawkins-Hooker at ISMB

By sharon.betts, on 4 October 2023

In July of 2023, our Cohort 2 student Alex Hawkins-Hooker presented his work at the Machine Learning in Computational and Systems Biology Track at ISMB, which is one of the leading computational biology conferences.
The full paper describing this work, ‘Getting personal with epigenetics: towards individual-specific epigenomic imputation with machine learning’, has since been published in Nature Communications: https://www.nature.com/articles/s41467-023-40211-2.
The work was started before Alex came to UCL but completed during his PhD, and was carried out jointly with collaborators at the Max Planck Institute for Intelligent Systems in Tübingen and the University of Dundee.
If you are interested in reading more publications by our outstanding students, do check out our publications page on our website.

“Safe Trajectory Sampling in Model-based Reinforcement Learning for Robotic Systems” By Sicelukwanda Zwane

By sharon.betts, on 29 September 2023

In the exciting realm of Model-based Reinforcement Learning (MBRL), researchers are constantly pushing the boundaries of what robots can learn to achieve when given access to an internal model of the environment. One key challenge in this field is ensuring that robots can perform tasks safely and reliably, especially in situations where they lack prior data or knowledge about the environment. That’s where the work of Sicelukwanda Zwane comes into play.

Background

In MBRL, robots use small sets of data to learn a dynamics model. This model acts like a crystal ball, predicting how the system will respond to a given sequence of actions. With MBRL, we can train policies on simulated trajectories sampled from the dynamics model instead of generating them by executing each action on the actual system, a process that can take an extremely long time on a physical robot and cause wear and tear.
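A minimal sketch of this simulate-instead-of-execute loop, with a toy linear model standing in for the learned GP dynamics (all names and numbers here are illustrative, not from the paper):

```python
import numpy as np

def rollout(dynamics, policy, s0, horizon):
    # Simulate a trajectory from the learned dynamics model instead of the
    # real robot: dynamics(s, a) -> next state, policy(s) -> action.
    states, actions = [s0], []
    s = s0
    for _ in range(horizon):
        a = policy(s)
        s = dynamics(s, a)  # model prediction, no real-robot execution
        actions.append(a)
        states.append(s)
    return np.array(states), np.array(actions)

# Toy stand-ins: stable linear dynamics and a proportional controller.
dyn = lambda s, a: 0.9 * s + 0.1 * a
pol = lambda s: -0.5 * s
states, actions = rollout(dyn, pol, s0=np.array([1.0]), horizon=20)
```

Policy training then optimises the policy parameters against returns computed on many such simulated rollouts.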

One of the tools often used in MBRL is the Gaussian process (GP) dynamics model. GPs are fully Bayesian models that not only model the system but also account for the uncertainty in state observations. Additionally, they are flexible and can learn without making strong assumptions about the underlying system dynamics [1].

The Challenge of Learning Safely

When we train robots to perform tasks, it’s not enough to just predict what will happen; we need to do it safely. As with most model classes used in MBRL, GPs don’t naturally incorporate safety constraints, which means they may produce unsafe or infeasible trajectories. This is particularly true during the early stages of learning: when the model hasn’t seen much data, it can produce unsafe and seemingly random trajectories.

For a 7 degree of freedom (DOF) manipulator robot, bad trajectories may contain self-collisions.

 

Distributional Trajectory Sampling

In standard GP dynamics models, the posterior is represented in distributional form, via its parameters: the mean vector and covariance matrix. In this form, it is difficult to reason about the safety of entire trajectories, because trajectories are generated through iterative random sampling. Furthermore, this kind of trajectory sampling is limited to cases where the intermediate state marginal distributions are Gaussian.

Pathwise Trajectory Sampling

Zwane uses an innovative alternative called “pathwise sampling” [3], which draws samples from GP posteriors using an efficient method known as Matheron’s rule. The result is a set of smooth, deterministic trajectories that aren’t confined to Gaussian distributions and are temporally correlated.
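To give a flavour of Matheron’s rule, here is a toy numpy sketch of drawing one GP posterior sample as prior-draw-plus-data-correction. Note that the pathwise method of [3] approximates the prior with random Fourier features for efficiency, which is what makes whole smooth sample paths cheap; this simplified version draws the joint prior exactly, and all kernel choices and parameters are illustrative.

```python
import numpy as np

def rbf(A, B, ls=0.5):
    # RBF kernel matrix between point sets A (n, d) and B (m, d).
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ls**2))

def matheron_posterior_sample(X, y, Xs, noise=1e-2, rng=None):
    # Matheron's rule:
    #   f_post(Xs) = f_prior(Xs) + K(Xs,X) (K(X,X) + s*I)^{-1} (y - f_prior(X) - eps)
    # i.e. a prior sample corrected by the data it should have explained.
    rng = np.random.default_rng() if rng is None else rng
    Z = np.vstack([X, Xs])
    Kjoint = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
    f = np.linalg.cholesky(Kjoint) @ rng.normal(size=len(Z))  # joint prior draw
    fX, fXs = f[: len(X)], f[len(X):]
    eps = np.sqrt(noise) * rng.normal(size=len(X))
    update = rbf(Xs, X) @ np.linalg.solve(
        rbf(X, X) + noise * np.eye(len(X)), y - fX - eps
    )
    return fXs + update

rng = np.random.default_rng(0)
X = np.linspace(-1, 1, 10)[:, None]
y = np.sin(3 * X[:, 0])
Xs = np.linspace(-1, 1, 50)[:, None]
sample = matheron_posterior_sample(X, y, Xs, rng=rng)
```

Each call returns one smooth, deterministic function evaluated along `Xs`, which is what lets entire trajectories be checked against constraints as single objects.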

Adding Safety

The beauty of pathwise sampling [3] is that it provides a particle representation of the GP posterior in which individual trajectories are smooth, differentiable, deterministic functions. This makes it possible to separate constraint-violating trajectories from safe ones: rejection sampling discards trajectories that violate the safety constraints, leaving only the safe ones to train the policy. Additionally, soft constraint penalty terms are added to the reward function.
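A toy sketch of this filter-and-penalise step (the constraint, rewards, and penalty value below are illustrative stand-ins, not the paper’s):

```python
def filter_safe(trajectories, constraint_ok):
    # Rejection sampling: keep only trajectories that satisfy the safety
    # constraint at every state along the path.
    return [tau for tau in trajectories if all(constraint_ok(s) for s in tau)]

def penalised_return(tau, reward, constraint_ok, penalty=10.0):
    # Soft-constraint return: subtract a penalty for each violating state.
    return sum(reward(s) - (0.0 if constraint_ok(s) else penalty) for s in tau)

# Toy 1-D example: a 'ceiling' constraint and a reach-the-goal reward.
ceiling_ok = lambda s: s < 1.0
reward = lambda s: -abs(s - 0.5)  # reach s = 0.5

trajs = [[0.2, 0.4, 0.5], [0.2, 1.2, 0.5]]  # second one breaches the ceiling
safe = filter_safe(trajs, ceiling_ok)
```

Only the safe trajectory survives the filter, and the penalty term additionally makes the violating trajectory score far worse if it is kept.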

Sim-Real Robot Experiments

To put this approach to the test, Zwane conducted experiments involving a 7-DoF robot arm in a simulated constrained reaching task, where the robot has to avoid colliding with a low ceiling. The method successfully learned a reaching policy that adhered to safety constraints, even when starting from random initial states.

In this constrained manipulation task, the robot is able to reach the goal (shown by the red sphere – bottom row) without colliding with the ceiling (blue – bottom row) using less than 100 seconds of data in simulation.

Summary

Sicelukwanda Zwane’s research incrementally advances the safety of simulated trajectories by incorporating safety constraints while keeping the benefits of fully Bayesian dynamics models such as GPs. This method promises to take MBRL out of simulated environments and make it more applicable to real-world settings. If you’re interested in this work, we invite you to dive into the full paper, published at the recent IEEE CASE 2023 conference.

References

 

  1. M. P. Deisenroth and C. E. Rasmussen. PILCO: A Model-based and Data-efficient Approach to Policy Search. ICML, 2011.
  2. S. Kamthe and M. P. Deisenroth. Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control. AISTATS, 2018.
  3. J. T. Wilson, V. Borovitskiy, A. Terenin, P. Mostowsky, and M. P. Deisenroth. Pathwise Conditioning of Gaussian Processes. JMLR, 2021.

 

Welcome to our blog!

By Sharon C Betts, on 18 November 2021

Welcome to the blog for the UKRI Centre for Doctoral Training in Foundational Artificial Intelligence.

Our aim with this blog is to tell the wider world about our research, ourselves, and our ambitions in helping to create new algorithms and foundational practices in artificial intelligence, in support of the UK National AI Strategy.

Our CDT is one of 16 UKRI-funded CDTs focusing on artificial intelligence, building on the UK’s history of excellence in machine learning.

The UKRI CDT in Foundational AI sits within UCL’s Centre for Artificial Intelligence in the heart of London, bringing together some of the best minds in machine learning, natural language processing, robotics, deep learning, and reinforcement learning (and so much more!).

We look forward to letting you know more about us and what we are doing to help forward the research in artificial intelligence and create new frontiers in research.