## University College London (COMP0168)

This course introduces students to topics that have trended over the last five years at international machine learning conferences. The backbone of the course is a series of tutorial-style introductory lectures on a set of selected topics. This is supplemented by seminar-style coursework, in which current research papers are read together, reviewed, discussed, and presented.

### Syllabus (preliminary)

- Gaussian Processes (please use Acrobat Reader for the animations) — Marc Deisenroth
- Bayesian Optimization — Marc Deisenroth
- Bayesian Deep Learning — Brooks Paige
- Integration in Machine Learning — Marc Deisenroth
- Meta Learning — Brooks Paige

### Delivery

The course will be delivered (at least partially) online. Lecture recordings will be available for viewing at home. Live Q&A sessions will take place in allocated time slots, on campus where possible.

### Teaching Assistants

- Yicheng Luo
- Mirgahney H. Mohamed
- Eric-Tuan Le

### Resources

#### Gaussian Processes

- Multi-class Gaussian Process Classification with Noisy Inputs
- Differentially Private Regression and Classification with Sparse Gaussian Processes
- Latent Gaussian process with composite likelihoods and numerical quadrature
- Scalable Gaussian Process Variational Autoencoders
- Kernel Interpolation for Scalable Online Gaussian Processes
- Hierarchical Inducing Point Gaussian Process for Inter-domain Observations
- Matérn Gaussian Processes on Graphs
- Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
- Sparse Algorithms for Markovian Gaussian Processes
- Linearly Constrained Gaussian Processes with Boundary Conditions
- Multi-Fidelity High-Order Gaussian Processes for Physical Simulation
- Sparse within Sparse Gaussian Processes using Neighbor Information
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
- On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes
- Bias-Free Scalable Gaussian Processes via Randomized Truncations
- Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes
- SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data
- Gaussian Process-Based Real-Time Learning for Safety Critical Applications
- Variational Auto-Regressive Gaussian Processes for Continual Learning
- Isometric Gaussian Process Latent Variable Model for Dissimilarity Data
- Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes
- High-Dimensional Gaussian Process Inference with Derivatives
- Tighter Bounds on the Log Marginal Likelihood of Gaussian Process Regression Using Conjugate Gradients
- Skew Gaussian Processes for Classification
- Pathwise Conditioning of Gaussian Processes
- Sparse Orthogonal Variational Inference for Gaussian Processes
- Task-Agnostic Amortized Inference of Gaussian Process Hyperparameters
- Deep Gaussian Processes
- Healing Products of Gaussian Process Experts
- Variational Learning of Inducing Variables in Sparse Gaussian Processes
- Variational Fourier Features for Gaussian Processes
- Deep Kernel Learning
- Exact Gaussian Processes on a Million Data Points
- Infinite-Horizon Gaussian Processes
- A Unifying View of Sparse Approximate Gaussian Process Regression
- Approximations for Binary Gaussian Process Classification
- Gaussian Process Modulated Cox Processes under Linear Inequality Constraints
- Convolutional Gaussian Processes
- Conditional Neural Processes
- Inter-domain Gaussian Processes for Sparse Inference using Inducing Features
- Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models
- Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies
- Randomly Projected Additive Gaussian Processes for Regression
- Sparse Gaussian Processes with Spherical Harmonic Features
- Parametric Gaussian Process Regressors
- Inter-domain Deep Gaussian Processes
- State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes
- Gaussian Processes for Data-Efficient Learning in Robotics and Control
- Matérn Gaussian Processes on Riemannian Manifolds
- Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences
- Learning Invariances using the Marginal Likelihood
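For orientation before diving into the papers above, here is a minimal numpy sketch of exact GP regression, the baseline that most of these readings extend or approximate. The RBF kernel, its hyperparameters, and the toy sine data are illustrative choices, not taken from any specific paper.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel matrix for 1-D inputs."""
    sqdist = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Exact GP regression: posterior mean and variance at test inputs."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)                       # stable solve via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy data: noise-free observations of sin(x).
X_train = np.linspace(0, 2 * np.pi, 8)
y_train = np.sin(X_train)
mean, var = gp_posterior(X_train, y_train, np.array([np.pi / 2]))
```

Exact inference costs O(n³) in the number of training points, which is precisely the bottleneck the sparse and scalable variants in this list address.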

#### Bayesian Optimization

- Faster & More Reliable Tuning of Neural Networks: Bayesian Optimization with Importance Sampling
- Scalable Constrained Bayesian Optimization
- BORE: Bayesian Optimization by Density-Ratio Estimation
- Collaborative Bayesian Optimization with Fair Regret
- Bias-Robust Bayesian Optimization via Dueling Bandits
- Bayesian Optimization over Hybrid Spaces
- Objective Bound Conditional Gaussian Process for Bayesian Optimization
- On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
- Lenient Regret and Good-Action Identification in Gaussian Process Bandits
- Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
- A General Framework for Constrained Bayesian Optimization using Information-based Search
- Bayesian Optimization with Inequality Constraints
- Bayesian Optimization in a Billion Dimensions via Random Embeddings
- Bayesian Optimization with Unknown Constraints
- Entropy Search for Information-Efficient Global Optimization
- An Efficient Approach for Assessing Hyperparameter Importance
- Scalable Bayesian Optimization Using Deep Neural Networks
- Freeze-Thaw Bayesian Optimization
- Maximizing Acquisition Functions for Bayesian Optimization
- Modulating Surrogates for Bayesian Optimization
- Projective Preferential Bayesian Optimization
- Multi-objective Bayesian Optimization using Pareto-frontier Entropy
- A General Framework for Multi-fidelity Bayesian Optimization with Gaussian Processes
- Multi-objective Bayesian optimisation with preferences over objectives
- Stagewise Safe Bayesian Optimization with Gaussian Processes
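To make the common structure of these papers concrete, here is a toy Bayesian optimisation loop in numpy: a GP surrogate over a 1-D candidate grid with a lower-confidence-bound acquisition. The objective, lengthscale, and exploration weight are arbitrary illustrative choices.

```python
import numpy as np

def rbf(X1, X2, lengthscale=0.3):
    """RBF kernel on 1-D inputs (unit signal variance)."""
    return np.exp(-0.5 * (X1[:, None] - X2[None, :]) ** 2 / lengthscale**2)

def gp_predict(X, y, X_star, noise=1e-4):
    """GP posterior mean and variance at candidate points X_star."""
    K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
    K_s = rbf(X, X_star)
    mu = K_s.T @ (K_inv @ y)
    var = 1.0 - np.sum((K_s.T @ K_inv) * K_s.T, axis=1)  # prior diag is 1
    return mu, np.maximum(var, 1e-12)

def objective(x):
    """Toy black-box objective to minimise; its optimum is at x = 0.7."""
    return (x - 0.7) ** 2

grid = np.linspace(0.0, 1.0, 201)       # candidate set
X = np.array([0.1, 0.9])                # initial design
y = objective(X)

for _ in range(10):
    mu, var = gp_predict(X, y, grid)
    lcb = mu - 2.0 * np.sqrt(var)       # lower confidence bound (beta = 2)
    x_next = grid[np.argmin(lcb)]       # most promising candidate
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

best_x = X[np.argmin(y)]
```

The papers above mostly vary the ingredients of this loop: the surrogate model, the acquisition function, and how constraints, fidelities, or high dimensions are handled.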

#### Bayesian Deep Learning

- Improving predictions of Bayesian neural nets via local linearization
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization
- How Good is the Bayes Posterior in Deep Neural Networks Really?
- Deep Ensembles: A Loss Landscape Perspective
- Deep Neural Networks as Gaussian Processes
- A Scalable Laplace Approximation for Neural Networks
- Deterministic Variational Inference for Robust Bayesian Neural Networks
- Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
- Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift
- Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
- Deep Kernel Learning
- A Simple Baseline for Bayesian Uncertainty in Deep Learning
- Deep Evidential Regression
- Depth Uncertainty in Neural Networks
- Neural Tangent Kernel: Convergence and generalization in neural networks
- Bayesian Deep Ensembles via the Neural Tangent Kernel
- Deep Bayesian Active Learning with Image Data
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
- Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
- Subspace inference for Bayesian deep learning
- Noisy Natural Gradient as Variational Inference
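Several of these readings (e.g. the dropout-as-Bayesian-approximation line of work) share the idea of turning stochastic forward passes into predictive uncertainty. A numpy sketch of MC dropout, where random weights stand in for a hypothetical trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny MLP; random weights stand in for a trained network.
W1, b1 = rng.normal(size=(1, 50)), np.zeros(50)
W2, b2 = rng.normal(size=(50, 1)), np.zeros(1)

def stochastic_forward(x, p_drop=0.5):
    """One forward pass with dropout kept ON at prediction time."""
    h = np.maximum(x @ W1 + b1, 0.0)        # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop     # fresh Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)           # inverted-dropout rescaling
    return h @ W2 + b2

# MC dropout: average many stochastic passes for a predictive distribution.
x = np.array([[0.3]])
samples = np.stack([stochastic_forward(x) for _ in range(200)])
pred_mean = samples.mean(axis=0)
pred_std = samples.std(axis=0)   # spread across passes ~ model uncertainty
```

The spread across passes approximates epistemic uncertainty; other papers in this list obtain similar predictive distributions via ensembles, Laplace approximations, or variational inference.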

#### Integration in Machine Learning

- Trumpets: Injective flows for inference and inverse problems
- Normalizing Flows Across Dimensions
- Transforming Gaussian Processes With Normalizing Flows
- Evaluating the Implicit Midpoint Integrator for Riemannian Hamiltonian Monte Carlo
- Bayesian Quadrature on Riemannian Data Manifolds
- Composing Normalizing Flows for Inverse Problems
- Self Normalizing Flows
- Sliced Iterative Normalizing Flows
- Scalable Normalizing Flows for Permutation Invariant Densities
- Unscented Filtering and Nonlinear Estimation
- Normalizing Flows on Tori and Spheres
- Riemannian Continuous Normalizing Flows
- Probabilistic Numerics and Uncertainty in Computations
- Classical Quadrature Rules via Gaussian Processes
- Optimal Monte Carlo Integration on Closed Manifolds
- On the Relation Between Gaussian Process Quadratures and Sigma-Point Methods
- Frank-Wolfe Bayesian Quadrature: Probabilistic Integration with Theoretical Guarantees
- Bayesian quadrature for ratios
- Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature
- Bandit Based Monte-Carlo Planning
- Monte Carlo Gradient Estimation in Machine Learning
- Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering
- On Sequential Monte Carlo Sampling Methods for Bayesian Filtering
- Neural Importance Sampling
- Learning in Implicit Generative Models
- Neural Ordinary Differential Equations
- FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models
- How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization
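A unifying thread in this list is estimating integrals (expectations, marginal likelihoods) that lack closed forms. As a minimal illustration, a numpy comparison of Monte Carlo sampling against a deterministic trapezoidal quadrature for the same Gaussian expectation, E[x²] = 1 under N(0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo: estimate E[x^2] under N(0, 1) by sampling; exact value is 1.
samples = rng.normal(size=100_000)
mc_estimate = np.mean(samples**2)

# Quadrature: integrate x^2 * N(x; 0, 1) on a truncated grid with the
# trapezoidal rule (written out explicitly to stay dependency-free).
x = np.linspace(-8.0, 8.0, 4001)
integrand = x**2 * np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
quad_estimate = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))
```

Monte Carlo scales to high dimensions but converges slowly; classical quadrature is fast in low dimensions but breaks down as dimension grows. Bayesian quadrature, normalizing flows, and sequential Monte Carlo (all represented above) are different ways of negotiating this trade-off.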

#### Meta Learning

- Meta Learning in the Continuous Time Limit
- Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
- Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
- Meta-Learning Probabilistic Inference For Prediction
- Functional Regularisation for Continual Learning with Gaussian Processes
- Meta Reinforcement Learning with Latent Variable Gaussian Processes
- TaskNorm: Rethinking Batch Normalization for Meta-Learning
- Meta-learning with Stochastic Linear Bandits
- Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
- Meta-Learning with Shared Amortized Variational Inference
- Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs
- On the Global Optimality of Model-Agnostic Meta-Learning
- Information-theoretic Meta Learning with Gaussian processes
- Meta-learning MCMC proposals
- Learning to Learn by Gradient Descent by Gradient Descent
- Bayesian Meta-Learning for the Few-Shot Setting via Deep Kernels
- On First-Order Meta-Learning Algorithms
- Amortized Bayesian Meta Learning
- A closer look at few-shot classification
- How to train your MAML
- Modular Meta-Learning with Shrinkage
- OOD-MAML: Meta-Learning for Few-Shot Out-of-Distribution Detection and Classification
- Meta-Neighborhoods
- Meta-Learning Requires Meta-Augmentation
- Online Structured Meta-learning
- Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes
- Model-based Adversarial Meta-Reinforcement Learning
- Modeling and Optimization Trade-off in Meta-learning
- Look-ahead Meta Learning for Continual Learning
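Many of these papers refine the same bi-level template: an inner loop adapts to a sampled task, and an outer loop updates the shared initialisation. A toy first-order (Reptile-style) sketch in numpy, on scalar tasks whose optima are drawn around 3.0 (an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_adapt(theta, target, lr=0.1, steps=10):
    """Inner loop: a few gradient steps on the task loss (theta - target)^2."""
    for _ in range(steps):
        theta = theta - lr * 2.0 * (theta - target)
    return theta

theta = 0.0      # meta-parameter: the shared initialisation
meta_lr = 0.1
for _ in range(500):
    target = rng.normal(3.0, 0.5)                  # sample a task
    adapted = inner_adapt(theta, target)           # inner-loop adaptation
    theta = theta + meta_lr * (adapted - theta)    # Reptile-style meta-update
# theta drifts towards the mean task optimum (3.0).
```

MAML-style methods instead differentiate through the inner loop; the first-order papers above (Reptile, FOMAML) show this simpler update often suffices.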