Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov chain Monte Carlo (MCMC) sampling are the two main techniques used to implement Bayesian inference. In the past three decades, MCMC methods have faced a number of challenges in scaling to larger models (such as those in deep learning) and big-data problems. Advanced proposals that incorporate gradients, such as the Langevin proposal distribution, provide a means to address some of the limitations of MCMC sampling for Bayesian neural networks. Furthermore, MCMC methods have typically been confined to the statistics community and remain under-used among deep learning researchers. We present a tutorial on MCMC methods that covers simple Bayesian linear and logistic models as well as Bayesian neural networks. The aim of this tutorial is to bridge the gap between theory and implementation via coding, given the general sparsity of libraries and tutorials to this end. The tutorial provides Python code with data and instructions that enable its use and extension. We report results for selected benchmark problems, showing the strengths and weaknesses of implementing the respective Bayesian models via MCMC. We highlight the challenges of sampling multi-modal posterior distributions, particularly for Bayesian neural networks, and the need for improved convergence diagnostics.
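As a concrete illustration of the gradient-based Langevin proposal mentioned above, the following is a minimal sketch of a Metropolis-adjusted Langevin (MALA) sampler for a Bayesian linear model, written in Python with NumPy. The synthetic data, step size eps, and noise/prior scales sigma and tau are illustrative assumptions for this sketch, not values or code taken from the tutorial itself.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (an illustrative assumption, not the paper's benchmarks).
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=1.0, size=100)

def log_post(w, sigma=1.0, tau=10.0):
    # Log-posterior up to a constant: Gaussian likelihood plus Gaussian prior.
    resid = y - X @ w
    return -0.5 * (resid @ resid) / sigma**2 - 0.5 * (w @ w) / tau**2

def grad_log_post(w, sigma=1.0, tau=10.0):
    # Gradient of the log-posterior, used to build the Langevin drift.
    return X.T @ (y - X @ w) / sigma**2 - w / tau**2

def mala_step(w, eps=0.1):
    # One Metropolis-adjusted Langevin step: drift towards higher posterior
    # density, add Gaussian noise, then accept or reject with a correction
    # for the asymmetry of the proposal.
    mean_fwd = w + 0.5 * eps**2 * grad_log_post(w)
    prop = mean_fwd + eps * rng.normal(size=w.shape)
    mean_rev = prop + 0.5 * eps**2 * grad_log_post(prop)
    log_q_fwd = -np.sum((prop - mean_fwd)**2) / (2 * eps**2)
    log_q_rev = -np.sum((w - mean_rev)**2) / (2 * eps**2)
    log_alpha = log_post(prop) - log_post(w) + log_q_rev - log_q_fwd
    return prop if np.log(rng.uniform()) < log_alpha else w

w = np.zeros(3)
samples = []
for i in range(5000):
    w = mala_step(w)
    if i >= 1000:  # discard burn-in
        samples.append(w)
print("posterior mean:", np.mean(samples, axis=0), "true weights:", true_w)

The correction term log_q_rev - log_q_fwd keeps the chain exact despite the gradient drift; dropping it gives the biased unadjusted Langevin algorithm, while setting the drift to zero recovers plain random-walk Metropolis-Hastings.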