Variance reduction properties of the reparameterization trick

Bayesian inference for complex statistical models is computationally challenging because the posterior distribution, i.e. the distribution of the unknown model parameters given the observed data, is intractable. Variational Inference (VI) approximates the intractable posterior distribution by a member of a family of more tractable distributions. This is achieved by formulating an optimization objective whose minimizer is the member of the tractable family that best represents the true posterior distribution. The statistician then conducts the rest of the analysis based on this so-called variational approximation. VI is a powerful and scalable technique that has found numerous applications, for example in image generation and spatio-temporal ecological models.

The main challenge when implementing VI is that the optimization often needs to be solved by simulation, using Monte Carlo methods to estimate the gradient of the objective function. With the standard method for gradient estimation, the gradient is hard to estimate efficiently because the estimates are highly variable. This led researchers to develop the so-called reparameterization trick [1] for variance reduction, resulting in a more stable optimization. However, the literature had not given a proper explanation of why the trick is so effective. In fact, no mathematical proof existed showing that the trick actually reduces the variance.
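To make the two gradient estimators concrete, the following is a sketch in notation that is assumed here and does not come from the article. The objective is written generically as an expectation of a function h under the variational distribution q with parameters λ, and the "standard method" is taken to be the score-function estimator, which is the usual baseline in this literature. The reparameterization trick writes a draw θ as a deterministic transformation g of λ and parameter-free noise ε, so the gradient can be moved inside the expectation.

```latex
\begin{align*}
\mathcal{L}(\lambda) &= \mathbb{E}_{q_\lambda(\theta)}\left[ h(\theta) \right], \\
\text{score function:}\quad
\nabla_\lambda \mathcal{L}(\lambda) &= \mathbb{E}_{q_\lambda(\theta)}\left[ h(\theta)\, \nabla_\lambda \log q_\lambda(\theta) \right], \\
\text{reparameterization:}\quad
\nabla_\lambda \mathcal{L}(\lambda) &= \mathbb{E}_{p(\varepsilon)}\left[ \nabla_\lambda h\big(g(\lambda, \varepsilon)\big) \right],
\qquad \theta = g(\lambda, \varepsilon),\ \varepsilon \sim p(\varepsilon).
\end{align*}
```

Both expectations equal the same gradient, and each is estimated by averaging the bracketed quantity over Monte Carlo draws. The two estimators can nevertheless have very different variances.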

ACEMS researchers have now provided a mathematical treatment that sheds light on the variance reduction properties of the reparameterization trick [2]. Under simplifying assumptions, they proved that the trick reduces the variance compared to the standard method. However, they also showed that there exist cases where the trick has the opposite effect. The work was presented at the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS) in Okinawa, Japan, in April 2019, and was subsequently published in the conference proceedings. AISTATS is considered a top conference in Machine Learning.
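The variance reduction can be illustrated numerically on a toy problem. The sketch below is not the setting analysed in [2]; it assumes a one-dimensional Gaussian variational distribution with mean mu and standard deviation sigma, a hypothetical quadratic function h(theta) = -theta^2 / 2, and the score-function estimator as the "standard method". The function name gradient_samples and all parameter values are illustrative choices.

```python
import numpy as np


def gradient_samples(mu, sigma, n_samples, seed=0):
    """Per-draw estimates of d/dmu E_q[h(theta)] for q = N(mu, sigma^2).

    Toy assumption: h(theta) = -0.5 * theta**2, so the exact gradient is -mu.
    Returns (score_function_samples, reparameterization_samples).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)

    # Reparameterized draw: theta = g(mu, sigma, eps) = mu + sigma * eps.
    theta = mu + sigma * eps
    h = -0.5 * theta**2

    # Score-function estimator: h(theta) * d/dmu log q(theta).
    score = h * (theta - mu) / sigma**2

    # Reparameterization estimator: dh/dtheta * dtheta/dmu = -theta * 1.
    reparam = -theta

    return score, reparam


score, reparam = gradient_samples(mu=1.0, sigma=1.0, n_samples=200_000)

print("exact gradient:        ", -1.0)
print("score-function mean:   ", score.mean(), " variance:", score.var())
print("reparameterization mean:", reparam.mean(), " variance:", reparam.var())
```

Both sample means should be close to the exact gradient, while the per-draw variance of the reparameterization estimator is markedly smaller in this toy case. This is one example only and does not contradict the finding that the trick can increase the variance in other cases.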

"Publishing in AISTATS is quite an achievement, especially considering that the lead author, Ming Xu, was a Master student when the research was conducted", says Dr. Matias Quiroz, an Associate Investigator of ACEMS who co-authored the work with Chief Investigators Prof. Robert Kohn and Prof. Scott Sisson. The work was recently cited in a draft of a review article on Monte Carlo methods for gradient estimation [3], authored by leading scholars from Google's DeepMind.

  • [1] Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In The 2nd International Conference on Learning Representations. 
  • [2] Xu, M., Quiroz, M., Kohn, R., & Sisson, S. A. (2019). Variance reduction properties of the reparameterization trick. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2711-2720).
  • [3] Mohamed, S., Rosca, M., Figurnov, M., & Mnih, A. (2019). Monte Carlo Gradient Estimation in Machine Learning. arXiv preprint arXiv:1906.10652.
