Markov chain Monte Carlo (MCMC) algorithms are a powerful computational tool for Bayesian inference.
Sampling as optimization in the space of probability measures
Many sampling algorithms inspired by diffusion processes can in fact be viewed as noisy perturbations of gradient-descent algorithms. This observation has spurred a recent wave of research connecting optimization and sampling. We deepen this connection by viewing MCMC algorithms as optimization algorithms in the space of probability measures.
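To make the "noisy gradient descent" view concrete, the sketch below implements the unadjusted Langevin algorithm, which iterates a gradient step on the log-density plus Gaussian noise. The function and variable names are ours, chosen for illustration; the target here is a standard Gaussian, for which the gradient of the log-density is simply -x.

```python
import numpy as np

def ula_sample(grad_log_pi, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: gradient ascent on log pi plus Gaussian noise.

    Each iterate is x_{t+1} = x_t + step * grad log pi(x_t) + sqrt(2 * step) * N(0, I),
    i.e. gradient descent on -log pi perturbed by injected noise.
    """
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for t in range(n_steps):
        noise = rng.standard_normal(x.size)
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * noise
        samples[t] = x
    return samples

# Target: standard Gaussian in 2D, so grad log pi(x) = -x.
rng = np.random.default_rng(0)
samples = ula_sample(lambda x: -x, np.zeros(2), step=0.1, n_steps=5000, rng=rng)
```

With the noise term removed, the iteration is exactly gradient descent on -log pi; with it, the chain (after burn-in) produces approximate samples from pi, up to a discretization bias controlled by the step size.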
W. Mou, N. Flammarion, M. J. Wainwright, P. L. Bartlett, Improved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity, Bernoulli, 2021
Y.-A. Ma, N. Chatterji, X. Cheng, N. Flammarion, P. L. Bartlett, M. I. Jordan, Is there an analog of Nesterov acceleration for gradient-based MCMC?, Bernoulli, 2021
Y.-A. Ma, Y. Chen, C. Jin, N. Flammarion, M. I. Jordan, Sampling can be faster than optimization, PNAS, 2019
Large-scale MCMC algorithms
To handle large datasets, one can use stochastic-gradient Langevin dynamics, which substitutes stochastic gradients for full gradients when approximating a Langevin diffusion. Variance-reduction techniques, originally developed in the setting of stochastic optimization, can also be applied to MCMC algorithms to achieve fast convergence while still using cheaply computed stochastic gradients.
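A minimal sketch of one stochastic-gradient Langevin dynamics update follows; the helper names and the toy Gaussian-mean model are our own, for illustration. The key point is that the minibatch gradient is rescaled by n / batch_size so it is an unbiased estimate of the full-data gradient of the log-posterior.

```python
import numpy as np

def sgld_step(theta, data, batch_size, grad_log_prior, grad_log_lik, step, rng):
    """One SGLD update: stochastic log-posterior gradient plus Langevin noise."""
    n = len(data)
    idx = rng.choice(n, size=batch_size, replace=False)
    # Rescale the minibatch sum so the gradient estimate is unbiased for the full data.
    grad = grad_log_prior(theta) + (n / batch_size) * sum(
        grad_log_lik(theta, data[i]) for i in idx
    )
    return theta + step * grad + np.sqrt(2.0 * step) * rng.standard_normal(np.shape(theta))

# Toy model: data ~ N(mu, 1) with a N(0, 1) prior on mu, so the posterior is Gaussian.
rng = np.random.default_rng(1)
data = rng.standard_normal(100) + 2.0
theta = np.zeros(1)
trace = []
for _ in range(2000):
    theta = sgld_step(theta, data, batch_size=10,
                      grad_log_prior=lambda t: -t,          # N(0, 1) prior
                      grad_log_lik=lambda t, x: x - t,      # N(t, 1) likelihood
                      step=1e-3, rng=rng)
    trace.append(theta.item())
```

Each iteration touches only a minibatch of the data, yet (for small step sizes) the chain concentrates around the posterior mean, at the cost of extra variance from the stochastic gradient.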
N. S. Chatterji, N. Flammarion, Y.-A. Ma, P. L. Bartlett, M. I. Jordan, On the theory of variance reduction for stochastic gradient Monte Carlo, ICML, 2018
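The variance-reduction idea can be sketched as an SVRG-style control variate for the gradient estimate; the function below is our own illustrative version, not the algorithm of any specific paper. A snapshot ("anchor") point with a precomputed full-data gradient is kept, and the minibatch only estimates the difference of gradients between the current point and the anchor, which has much lower variance when the two are close.

```python
import numpy as np

def svrg_gradient(theta, anchor, full_grad_at_anchor, data, batch_size,
                  grad_log_lik, rng):
    """SVRG-style control-variate estimate of the full-data gradient.

    Returns full_grad_at_anchor + (n / b) * sum over a minibatch of
    [grad(theta, x_i) - grad(anchor, x_i)]: unbiased, and low-variance
    whenever theta stays near the anchor.
    """
    n = len(data)
    idx = rng.choice(n, size=batch_size, replace=False)
    diff = sum(grad_log_lik(theta, data[i]) - grad_log_lik(anchor, data[i])
               for i in idx)
    return full_grad_at_anchor + (n / batch_size) * diff

# Sanity check with a linear-in-theta gradient (Gaussian likelihood): the
# per-example difference is constant, so the estimate is exact for any batch.
data = np.arange(5.0)
anchor = np.array([0.0])
full_at_anchor = data.sum() - len(data) * anchor
theta = np.array([1.3])
est = svrg_gradient(theta, anchor, full_at_anchor, data, batch_size=2,
                    grad_log_lik=lambda t, x: x - t, rng=np.random.default_rng(0))
```

For this toy Gaussian case the control variate removes the minibatch variance entirely; in general it reduces it by an amount depending on how far the chain has drifted from the anchor, which is why the anchor is refreshed periodically.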