Professor Marc Deisenroth is the Google DeepMind Chair of Machine Learning and Artificial Intelligence at University College London, part of the UNESCO Chair on Artificial Intelligence at UCL, and Director of Science & Innovation—Grand Challenges (Environment & Sustainability) at The Alan Turing Institute. He also holds a visiting faculty position at the University of Johannesburg. Marc co-leads the Sustainability and Machine Learning Group at UCL. His research interests center around data-efficient machine learning, probabilistic modeling and autonomous decision making with applications in climate/weather science, nuclear fusion, and robotics.
Marc was Program Chair of EWRL 2012, Workshops Chair of RSS 2013, EXPO Chair at ICML 2020, Tutorials Chair at NeurIPS 2021, and Program Chair at ICLR 2022. He is an elected member of the ICML Board. He received Paper Awards at ICRA 2014, ICCAS 2016, ICML 2020, AISTATS 2021, and FAccT 2023. In 2019, Marc co-organized the Machine Learning Summer School in London.
In 2018, Marc received The President’s Award for Outstanding Early Career Researcher at Imperial College. He is a recipient of a Google Faculty Research Award and a Microsoft PhD Grant.
In 2018, Marc spent four months at the African Institute for Mathematical Sciences (Rwanda), where he taught a course on Foundations of Machine Learning as part of the African Masters in Machine Intelligence. He is co-author of the book Mathematics for Machine Learning, published by Cambridge University Press.
Machine Learning: Data-efficient machine learning, Gaussian processes, reinforcement learning, Bayesian optimization, approximate inference, deep probabilistic models, geo-spatial models
Robotics and Control: Robot learning, legged locomotion, planning under uncertainty, imitation learning, adaptive control, robust control, learning control, optimal control
Signal Processing: Nonlinear state estimation, Kalman filtering, time-series modeling, dynamical systems, system identification, stochastic information processing
Estimating the latent state of a dynamical system based on noisy observations is common challenge underlying many task in engineering, robotics, or climate science. Classical approaches to state estimation include Kalman filtering/smoothing, which is provably optimal in linear-Gaussian dynamical systems. In nonlinear systems, we need to resort to approximations, such as the extended or unscented Kalman smoothers (EKS, UKS), which will find a Gaussian representation of the latent posterior. This posterior can be arbitrarily bad. In this talk, I will provide a machine learning perspective on state estimation, and formulate it as a message passing problem. Message passing will not only allow for distributed and asynchronous updates of the latent-state posterior, but it also allows us to iteratively refine that posterior. We can show that message passing using approximate expectation propagation is identical to classical Gaussian filters for nonlinear systems, such as the EKS/UKS, in its very first iteration. However, with its mechanism to iteratively refine the posterior distribution, we typically can significantly improve on the solution provided by the classical EKS/UKS.
Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer-by-layer but lose all the interpretability of shallow GPs. An alternative construction is to successively parameterize the lengthscale of a kernel, improving the interpretability but ultimately giving away the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies which may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer defines locally linear transformations of the original input data maintaining the concept of latent embeddings while also retaining the interpretation of lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well in standard benchmark datasets.
Neural ODEs demonstrate strong performance in generative and time-series modelling. However, training them via the adjoint method is slow compared to discrete models due to the requirement of numerically solving ODEs. To speed neural ODEs up, a common approach is to regularise the solutions. However, this approach may affect the expressivity of the model; when the trajectory itself matters, this is particularly important. In this paper, we propose an alternative way to speed up the training of neural ODEs. The key idea is to speed up the adjoint method by using Gauß-Legendre quadrature to solve integrals faster than ODE-based methods while remaining memory efficient. We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters. Our approach leads to faster training of neural ODEs, especially for large models. It also presents a new way to train SDE-based models.
With the advent of large datasets, offline reinforcement learning is a promising framework for learning good decision-making policies without the need to interact with the real environment. However, offline RL requires the dataset to be reward-annotated, which presents practical challenges when reward engineering is difficult or when obtaining reward annotations is labor-intensive. In this paper, we introduce Optimal Transport Relabeling (OTR), an imitation learning algorithm that can automatically relabel offline data of mixed and unknown quality with rewards from a few good demonstrations. OTR’s key idea is to use optimal transport to compute an optimal alignment between an unlabeled trajectory in the dataset and an expert demonstration to obtain a similarity measure that can be interpreted as a reward, which can then be used by an offline RL algorithm to learn the policy. OTR is easy to implement and computationally efficient. On D4RL benchmarks, we demonstrate that OTR with a single demonstration can consistently match the performance of offline RL with ground-truth rewards.
A typical criticism of Gaussian processes is their unfavourable scaling in both compute and memory requirements. Sparse variational Gaussian processes based on inducing variables are commonly used to scale Gaussian processes to large dataset sizes; their inherent compute and memory requirements are dominated by the number of inducing variables used. However, in practise sparse GPs are still limited by the number of datapoints and the number of inducing points one can use to perform matrix operations, making it again challenging to model large complex datasets. In this work we propose a new class of inter-domain variational GP, constructed by projecting the GP onto a set of compactly supported B-Spline basis functions. The key benefit of our approach is that the compact support of the B-Spline basis admits the use of sparse linear algebra to significantly speed up matrix operations and drastically reduce the memory footprint.
Bayesian inference in non-linear dynamical systems seeks to find good posterior approximations of a latent state given a sequence of observations. Gaussian filters and smoothers, including the (extended/unscented) Kalman filter/smoother, which are commonly used in engineering applications, yield Gaussian posteriors on the latent state. While they are computationally efficient, they are often criticised for their crude approximation of the posterior state distribution. In this paper, we address this criticism by proposing a message passing scheme for iterative state estimation in non-linear dynamical systems, which yields more informative (Gaussian) posteriors on the latent states. Our message passing scheme is based on expectation propagation (EP). We prove that classical Rauch–Tung–Striebel (RTS) smoothers, such as the extended Kalman smoother (EKS) or the unscented Kalman smoother (UKS), are special cases of our message passing scheme. Running the message passing scheme more than once can lead to significant improvements of the classical RTS smoothers, so that more informative state estimates can be obtained. We address potential convergence issues of EP by generalising our state estimation framework to damped updates and the consideration of general alpha-divergences.