Bayesian Methods for Big Data Sets and Intractable Likelihoods; Scalable Methods and Approaches

Lead CI: Tony Pettitt

Analytically or computationally intractable likelihoods arise in a wide range of applications, from stochastic models in genetics and biology through to spatial and temporal models in image analysis.

Some current approaches are analytic, based on pseudo-likelihoods or quasi-likelihoods. Others are numerical and based on optimization, such as EM-like algorithms or variational Bayes. A third group is computer intensive and based on Monte Carlo, such as Markov chain Monte Carlo (MCMC), Sequential Monte Carlo (SMC), Approximate Bayesian Computation (ABC) or indirect inference.

Parallel computing platforms are making these computer intensive Monte Carlo approaches feasible for large data sets. Sequential Monte Carlo is well suited to such platforms, since its particle propagation and weighting steps are embarrassingly parallel.

Network or graphical data provide an example where models are computationally tractable to simulate from but exact likelihood methods are intractable. Here, there is a need for likelihood approximations based on summary statistics whose computation is scalable, and for the application of these approximations in Bayesian methods.
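The idea of replacing the likelihood with a comparison of summary statistics can be illustrated with a minimal ABC rejection sketch in Python. Everything here is an illustrative assumption rather than a method of the project: the Erdős–Rényi random graph is only a toy stand-in (its likelihood is in fact tractable), edge density is the chosen summary statistic, and the function names, the uniform prior and the tolerance are hypothetical.

```python
import random

def simulate_edge_count(n_nodes, p, rng):
    """Simulate a toy Erdos-Renyi graph and return its number of edges."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    # Each possible edge is present independently with probability p.
    return sum(1 for _ in range(n_pairs) if rng.random() < p)

def abc_rejection(observed_density, n_nodes, n_draws, tolerance, seed=0):
    """Keep prior draws whose simulated edge density is close to the observed one."""
    rng = random.Random(seed)
    n_pairs = n_nodes * (n_nodes - 1) // 2
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # assumed Uniform(0, 1) prior on the edge probability
        density = simulate_edge_count(n_nodes, p, rng) / n_pairs
        # Accept when the summary statistic is within the tolerance.
        if abs(density - observed_density) <= tolerance:
            accepted.append(p)
    return accepted

# Hypothetical observed edge density of 0.3 on a 30-node network.
samples = abc_rejection(observed_density=0.3, n_nodes=30, n_draws=2000, tolerance=0.02)
posterior_mean = sum(samples) / len(samples)
```

The accepted values in `samples` approximate the posterior of the edge probability given the summary statistic. Only simulation from the model is required, and the cost per draw is governed by the cost of simulating and summarising one network.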

Investigations will include the following.

  • For ABC, the use of SMC and of approaches that make sampling and approximate likelihood proposals more efficient.
  • Scalable matrix algebra methods using Gaussian process approximations.
  • Use of indirect inference or approximating models, and estimation of functions relating parameters to the mean values of statistics.
  • Algorithm implementation in parallel computational environments.
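As a rough illustration of estimating a function that relates parameters to the mean values of statistics, the following Python sketch simulates a toy model on a small parameter grid, fits a regression to the simulated mean statistic, and inverts the fit at an observed value. The exponential model, the log-log straight-line fit, the grid and all names are assumptions made for illustration only, not the project's method.

```python
import math
import random

def mean_statistic(theta, n_obs, n_reps, rng):
    """Monte Carlo estimate of E[sample mean] for Exponential(rate=theta) data."""
    total = 0.0
    for _ in range(n_reps):
        total += sum(rng.expovariate(theta) for _ in range(n_obs)) / n_obs
    return total / n_reps

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

rng = random.Random(1)
thetas = [0.5, 1.0, 2.0, 4.0]  # assumed parameter grid
log_theta = [math.log(t) for t in thetas]
log_stat = [math.log(mean_statistic(t, n_obs=50, n_reps=200, rng=rng)) for t in thetas]

# Estimated relation: log E[statistic] is approximately a + b * log(theta).
a, b = fit_line(log_theta, log_stat)

# Invert the fitted relation at a hypothetical observed statistic.
observed_stat = 0.5
theta_hat = math.exp((math.log(observed_stat) - a) / b)
```

The simulations at different grid points are independent of one another, so in a parallel environment each grid point could be handled by a separate worker before the regression is fitted.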

Some of the specific applications and motivating examples include stochastic models of population dynamics from disease modelling and systems biology; spatial analysis and image analysis; network or graphical data; neurology with investigations involving upper and lower motor neurons, neuropathies and disease.