PhD Researchers on the move: Journey to Vancouver to attend NeurIPS
By Claire Hudson, on 31 January 2025
Last month, CDT students travelled to Vancouver to present their work at NeurIPS, one of the largest AI conferences. The schedule was packed, with 6 conference papers and 3 workshop papers presented. The CDT is proud to sponsor and support our PhD students as they present their research at the conference, showcasing their hard work and academic excellence on an international stage!
Day 1, Tuesday: No papers were presented, but the expo, tutorials, and careers fair kept everyone occupied.
Day 2, Wednesday
William Bankes
William and his supervisor presented their work on robust data downsampling. Naive approaches to training on a subset of data can cause problems when classes in the dataset are imbalanced, with rare classes becoming even rarer. As a direct result, the model’s ability to predict on these rare classes decreases. To address this, they proposed an algorithm called REDUCR, which downsamples data in a manner that preserves the performance of minority classes within the dataset. They show that REDUCR works across a range of text and image problems, achieving state-of-the-art results. The work is available here.
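To make the imbalance problem concrete, here is a minimal sketch in Python. It is not REDUCR: it only contrasts a uniform random downsample with a simple class-balanced (stratified) one on a hypothetical dataset of 980 "common" and 20 "rare" labels. The function names, the dataset, and the budget of 40 examples are all assumptions made for illustration.

```python
import random
from collections import Counter


def uniform_downsample(labels, k, seed=0):
    """Pick k indices uniformly at random, ignoring class labels."""
    rng = random.Random(seed)
    return rng.sample(range(len(labels)), k)


def stratified_downsample(labels, k, seed=0):
    """Pick up to k indices, giving each class an equal share where possible."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    share = k // len(by_class)  # equal budget per class
    chosen = []
    for indices in by_class.values():
        # A class with fewer than `share` examples contributes all of them.
        chosen.extend(rng.sample(indices, min(share, len(indices))))
    return chosen


# Hypothetical imbalanced dataset: 980 "common" and 20 "rare" examples.
labels = ["common"] * 980 + ["rare"] * 20

uniform_counts = Counter(labels[i] for i in uniform_downsample(labels, 40))
stratified_counts = Counter(labels[i] for i in stratified_downsample(labels, 40))

print("uniform:   ", dict(uniform_counts))
print("stratified:", dict(stratified_counts))
```

With these numbers, a uniform sample of 40 contains on average 40 × 20 / 1000 = 0.8 rare examples, so the rare class can vanish from the subset entirely, while the stratified version keeps 20 of each class. This is only the simplest possible fix; the paper's approach to preserving minority-class performance is described in the work itself.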
Jake Cunningham also presented his work on Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling.
Day 3, Thursday
Yuchen Zhu
Yuchen’s joint paper with Jialin Yu and Ricardo Silva from UCL Statistical Sciences, Structured Learning of Compositional Sequential Models, was presented as part of the main proceedings of NeurIPS 2024. They proposed an explicit model for expressing how the effect of sequential interventions can be isolated into modules, clarifying previously unclear data conditions that allow for the identification of their combined effect at different units and time steps. The paper is here. Additionally, together with collaborators from MPI Tuebingen, Yuchen presented a paper, Unsupervised Causal Abstraction, at the Causal Representation Learning workshop. Motivated by the lack of interpretability of current large black-box models, they propose a methodology for causally abstracting a large model to a smaller and more interpretable model. In particular, unlike existing methods, their method does not require supervision signals from the smaller model. The paper can be found here.
David presented a paper along with co-author Daniel Tan, another UCL PhD student. They find that Contrastive Activation Addition (CAA) steering has mixed results in terms of robustness and reliability. Steering vectors tend to generalise out of distribution when they work in distribution. However, steerability is highly variable across different inputs: depending on the concept, spurious biases can substantially contribute to how effective steering is for each input, presenting a challenge for the widespread use of steering vectors. While CAA is effective on some tasks, other behaviours turn out to be unsteerable. As a result, it is difficult to ensure that steering vectors will be effective on a given task of interest, limiting their reliability as a general alignment intervention. The paper is available here.
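For readers unfamiliar with steering vectors, the basic idea behind contrastive activation steering can be sketched in a few lines of Python. This is a toy illustration, not the authors' code: the three-dimensional "activations" are made-up numbers standing in for a model's hidden states, and the function names and the `scale` parameter are assumptions.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive_acts, negative_acts):
    """Difference between mean activations on positive and negative examples."""
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def steer(activation, vector, scale=1.0):
    """Add the scaled steering vector to an activation at inference time."""
    return [a + scale * v for a, v in zip(activation, vector)]


# Made-up activations for prompts that do / do not show the target behaviour.
positive_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
negative_acts = [[0.0, 2.0, 1.0], [0.0, 2.0, 3.0]]

vec = steering_vector(positive_acts, negative_acts)
steered = steer([0.5, 0.5, 0.5], vec, scale=2.0)

print(vec)      # [2.0, 0.0, -2.0]
print(steered)  # [4.5, 0.5, -3.5]
```

In a real model the activations would come from a chosen layer, and the same vector is added for every input. The variability the paper reports is about how differently individual inputs respond to that single shared vector.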
Day 4, Friday
Reuben Adams
Reuben presented his paper (co-authored with his supervisors) on extending a classic theorem in the PAC-Bayes literature to account for arbitrary outcomes, rather than simply correct or incorrect classifications. Their work provides theoretical insight into the generalisation behaviour of neural networks and the different kinds of errors they can make. Their framework can cover not just Type I and Type II errors, but any kind of error that may occur in multi-class classification. You can find the paper here.
Daniel Augusto
Daniel presented a paper, co-authored with collaborators from the Getúlio Vargas Foundation, in the main conference track. Their work proposes a new solution to streaming variational Bayesian inference using GFlowNets as a foundation for their methodology. It is the first work that allows high-quality variational inference for discrete parameters without requiring the storage of the whole dataset. They believe this work will be useful for applications in genetics, through the inference of phylogenetic trees, for preference learning, and other big-data contexts. Their paper can be read here.
Day 5, Saturday
Oscar Key
Oscar, along with co-authors from Graphcore, presented a poster at the workshop on Efficient Natural Language and Speech Processing. Their work considers the top-k operation, which finds the largest k items in a list, and investigates how it can be computed as quickly as possible on the parallel hardware commonly used to run AI applications. Top-k is found in many AI algorithms, so it’s useful to make it fast. For example, a large language model might use it to select the most important parts of a long prompt. Their full paper is available here.
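As a rough illustration of what top-k computes, here is a small Python sketch. It is not the method from the paper: it shows a sort-based reference, a heap-based version using the standard library's `heapq.nlargest`, and a chunked two-stage variant whose per-chunk work could in principle run in parallel. The function names, the `chunk_size` parameter, and the example scores are assumptions for illustration.

```python
import heapq


def top_k_sorted(values, k):
    """Reference version: sort everything, then keep the first k."""
    return sorted(values, reverse=True)[:k]


def top_k_heap(values, k):
    """Heap-based version that avoids sorting the whole list."""
    return heapq.nlargest(k, values)


def top_k_chunked(values, k, chunk_size):
    """Two-stage version: top-k within each chunk, then top-k of the candidates.

    Each chunk is independent, so the first stage could be run in parallel.
    """
    candidates = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        candidates.extend(heapq.nlargest(k, chunk))
    return heapq.nlargest(k, candidates)


# Hypothetical importance scores for eight parts of a prompt.
scores = [0.1, 0.9, 0.3, 0.7, 0.5, 0.8, 0.2, 0.6]

print(top_k_sorted(scores, 3))      # [0.9, 0.8, 0.7]
print(top_k_heap(scores, 3))        # [0.9, 0.8, 0.7]
print(top_k_chunked(scores, 3, 4))  # [0.9, 0.8, 0.7]
```

The chunked variant stays exact because any element in the overall top k must also be in the top k of its own chunk. How best to map this kind of computation onto real parallel hardware is the question the paper investigates.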
Varsha presented her research at the Workshop on Algorithmic Fairness through the Lens of Metrics and Evaluation (AFME 2024). This work addresses the challenge of evaluating classifier fairness when complete datasets, including protected attributes, are inaccessible. It proposes a novel approach that leverages separate overlapping datasets, such as internal datasets lacking demographic information and external sources like census data, to construct synthetic test data with all necessary variables. Experiments demonstrate that this approach produces synthetic data with high fidelity and offers reliable fairness evaluation where real data is limited. Varsha says that attending her first NeurIPS was an incredible experience, and that presenting her work and engaging in meaningful discussions throughout the conference was deeply rewarding, providing invaluable feedback and ideas as the work extends. Do reach out to her if you’d like to learn more!