This summer, the CDT supported two of its PhD students, Will Bankes and Masha Naslidnyk, in attending the International Conference on Machine Learning (ICML) 2025 in Vancouver. Set against the city’s beautiful harbourfront and mountain backdrop, ICML gathered thousands of researchers from around the world for a week of talks, posters, and workshops.
Will’s Story: Alignment at ICML
In July I traveled to Vancouver for this year’s International Conference on Machine Learning (ICML). The event was enormous: over 4,000 papers were accepted to the main conference alone, not counting the two full days of workshops and smaller gatherings that followed. The venue, a spacious convention center on the downtown harbourfront near Stanley Park, offered a stunning backdrop of seaplanes, mountains, forests, and wildlife; it was an inspiring place to meet, share ideas, and hear from fellow students, academics, and industry professionals.
I attended to present research from my PhD on a topic called alignment, a concept drawing significant attention in AI. Alignment is about ensuring a chatbot or large language model (LLM) behaves as intended. That’s no small feat: these models learn to generate text from vast internet data, which inevitably includes plenty of problematic material. Alignment is typically applied after training to steer the model’s behavior in a desired direction.
State-of-the-art alignment techniques work by comparing two possible responses to a prompt and indicating which one is better. For example, if you ask, “Which is better, pasta or sushi?” and the preferred answer is “sushi,” the model is nudged toward that choice. But what happens when preferences change? Maybe I had a bad sushi experience and now prefer pasta; how would the model know?
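To make the pairwise idea concrete, here is a minimal sketch of a DPO-style preference loss, the kind of objective that underlies many of these techniques. The log-probabilities below are placeholder numbers rather than outputs of a real model, and the temperature beta is an assumed hyperparameter; the point is only to show how a preferred and a dispreferred response enter the objective.

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style pairwise preference loss for one (prompt, better, worse) pair.

    The policy is rewarded for raising the log-probability of the preferred
    response relative to a frozen reference model, and penalised for doing
    the same with the dispreferred one.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# Toy example: the model slightly prefers the labelled-better answer ("sushi").
print(dpo_loss(logp_chosen=-4.2, logp_rejected=-5.0,
               ref_logp_chosen=-4.5, ref_logp_rejected=-4.8))
```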
Our work tackles this question with an algorithm we call NS-DPO, which re-weights training comparisons based on how old the underlying preferences are. We show through experiments and theory that this approach helps models adapt when opinions shift over time. For the interested reader, a technical and lay summary is available here:
https://icml.cc/virtual/2025/poster/44703
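The precise weighting scheme NS-DPO uses is described in the paper linked above; purely as an illustration of the general idea, one could down-weight older comparisons with an exponential decay before averaging the per-pair losses. The decay rate gamma and the toy numbers below are assumptions for the example, not the paper's actual values.

```python
import numpy as np

def time_weighted_preference_loss(pair_losses, ages, gamma=0.1):
    """Illustrative time-aware re-weighting of pairwise preference losses.

    pair_losses: per-comparison losses (e.g. from a DPO-style objective).
    ages: how old each underlying preference is, in arbitrary time units.
    gamma: assumed decay rate; larger values discount stale preferences faster.
    """
    weights = np.exp(-gamma * np.asarray(ages, dtype=float))
    weights /= weights.sum()  # normalise so the result is a weighted average
    return float(np.dot(weights, np.asarray(pair_losses, dtype=float)))

# Recent comparisons (small age) dominate the objective; old ones fade out.
losses = [0.6, 0.7, 0.4]   # toy per-pair losses
ages   = [0, 5, 30]        # most recent first
print(time_weighted_preference_loss(losses, ages))
```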
Attending a conference like ICML is invaluable as my PhD progresses, offering opportunities to connect with researchers and industry leaders across North America, Asia, and Europe. It was inspiring to exchange ideas on what is arguably one of the most transformative technologies of our era. Despite headlines about corporate power struggles, ethical concerns, and legal battles, meeting researchers who prioritize safety and global perspectives gives me hope that these critical conversations are moving in a thoughtful, inclusive direction. The experience reinforced my belief that collaborative, globally minded research is key to building AI we can all trust.
Masha’s Story: Kernels Everywhere
I attended ICML 2025 in Vancouver this July. One of my main interests is kernel methods, and while there wasn’t a session devoted exclusively to kernels, I found a strong theme of kernels as diagnostic and evaluation tools. What follows are my impressions of several papers that stood out.
One highlight for me was Learning Input Encodings for Kernel-Optimal Implicit Neural Representations. The work brought kernels into the theory of implicit neural representations (INRs). By analysing the infinite-width limit, the authors connected INR generalisation to kernel alignment and proposed a Kernel Alignment Regularizer (KAR). They went further with PEAK (Plug-in Encoding for Aligned Kernels), which learns input encodings tuned to the kernel perspective.
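Kernel-target alignment itself is a classical quantity, so as a rough reminder of what is being measured (and emphatically not the paper's exact KAR or PEAK construction), here is a small sketch of the alignment between a Gram matrix and a set of targets. The RBF kernel, bandwidth, and toy data are assumptions for illustration.

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Classical kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F).

    K: (n, n) kernel (Gram) matrix over the inputs.
    y: (n,) vector of targets; yy^T is the "ideal" kernel they induce.
    A value near 1 means the kernel's notion of similarity matches the targets.
    """
    Y = np.outer(y, y)
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))

# Toy check: an RBF kernel on 1-D inputs vs. centred targets that track the inputs.
x = np.linspace(0, 1, 20)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
print(kernel_target_alignment(K, y=x - x.mean()))
```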
Another strand came from evaluation rather than modeling. How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence introduced the Kernel Divergence Score (KDS). The authors compare kernel similarity matrices of embeddings before and after fine-tuning to detect contamination. Because fine-tuning distorts the embeddings of unseen data differently from those of memorised data, the divergence reflects the level of contamination.
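The exact Kernel Divergence Score is defined in the paper; as a toy stand-in for the recipe, one can build RBF Gram matrices from embeddings before and after fine-tuning and measure how much they differ. The embeddings, the bandwidth, and the simple Frobenius-norm divergence below are all assumptions for illustration, not the paper's score.

```python
import numpy as np

def rbf_gram(X, bandwidth=1.0):
    """RBF (Gaussian) kernel similarity matrix of a set of embeddings."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def kernel_divergence(emb_before, emb_after, bandwidth=1.0):
    """Toy divergence between pre- and post-fine-tuning kernel matrices.

    The intuition: fine-tuning moves the embeddings of memorised benchmark
    items differently from genuinely unseen items, so how far the two Gram
    matrices drift apart carries a signal about contamination.
    """
    K_before = rbf_gram(emb_before, bandwidth)
    K_after = rbf_gram(emb_after, bandwidth)
    return float(np.linalg.norm(K_before - K_after) / K_before.shape[0])

rng = np.random.default_rng(0)
emb_before = rng.normal(size=(50, 8))                      # benchmark embeddings
emb_after = emb_before + 0.05 * rng.normal(size=(50, 8))   # after fine-tuning
print(kernel_divergence(emb_before, emb_after))
```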
On the theoretical side, From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning aimed to reconcile kernel rescaling and kernel adaptation in different scaling regimes. Using statistical field theory, the authors showed that kernel rescaling captures part of the story, but not all—the directional changes matter, especially in nonlinear networks. The framework captures how directional feature learning shapes the covariance of network outputs, something standard rescaling methods overlook.
Equivariance was another recurring topic. Integration-free Kernels for Equivariant Gaussian Process Modelling tackled the problem that equivariant kernels usually require group integrations, which are expensive. Their construction, based on fundamental regions of group actions, gave integration-free kernels for vector-valued GPs. The empirical applications to velocity fields and dipole moments made a case that this is a practical tool.
Relatedly, Equivariant Neural Tangent Kernels extended NTK analysis to equivariant architectures. They derived explicit NTKs for group convolutional networks, showing that their dynamics match those of data-augmented non-equivariant networks, at least in expectation. The demonstration with plane roto-translations and SO(3) rotations showed that equivariant NTKs outperform plain ones in image and quantum prediction tasks.
Finally, the kernel and optimal transport crossover appeared in my own work on kernel quantile embeddings, and in Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances. The max-sliced Wasserstein distance has been used as a scalable alternative to full high-dimensional OT by projecting onto one-dimensional slices, and this work gave it a sharper theoretical footing. The authors provided finite-sample guarantees, clarified the computational complexity (including NP-hardness for KMS-2-Wasserstein), and proposed a semidefinite relaxation. This was a good example of kernels being used in conjunction with OT.
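For readers new to sliced OT, here is a crude, non-kernel sketch of the max-sliced idea: project both samples onto a direction, compute the cheap one-dimensional Wasserstein-1 distance, and take the largest value over candidate directions. Searching over random directions and working in plain Euclidean space are simplifications for illustration; the paper concerns the kernelised version, its exact optimisation, and its guarantees.

```python
import numpy as np

def sliced_w1(X, Y, direction):
    """1-D Wasserstein-1 distance between samples projected onto a direction."""
    px, py = np.sort(X @ direction), np.sort(Y @ direction)
    return np.mean(np.abs(px - py))  # assumes equal sample sizes

def max_sliced_w1(X, Y, n_directions=500, seed=0):
    """Crude max-sliced W1: maximise the 1-D distance over random unit directions."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_directions, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    return max(sliced_w1(X, Y, d) for d in dirs)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Y = rng.normal(loc=0.3, size=(200, 5))  # shifted distribution
print(max_sliced_w1(X, Y))
```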
Overall, kernels at ICML 2025 weren’t confined to one track: they showed up in architectures, in theory, and in diagnostics. The unifying theme I took away was that kernels continue to function both as analytical tools and as building blocks, bridging between modern machine learning and more classical statistical ideas.
Together, Will and Masha’s experiences at ICML highlight the diversity of research happening within the CDT, from ensuring AI systems align with human values to advancing the mathematics underpinning machine learning. Both students came away with fresh insights, new collaborations, and a sense of excitement about contributing to one of the most transformative fields of our time.