Bayesian inference offers principled tools to tackle many critical problems with modern neural networks, such as poor calibration, poor generalization, and data inefficiency. However, scaling Bayesian inference to large architectures is challenging and requires restrictive approximations. Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference and estimate uncertainty with deep neural networks. Traditionally, the dropout mask is sampled independently from a fixed distribution. Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference. These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal, which can be difficult to approximate with standard variational inference, and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation. In this work, we propose GFlowOut to address these issues. GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks. We empirically demonstrate that GFlowOut yields predictive distributions that generalize better to out-of-distribution data and provides uncertainty estimates that lead to better performance in downstream tasks.
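For context, the sketch below shows the standard Monte Carlo Dropout baseline that GFlowOut builds on: dropout masks are sampled independently from a fixed Bernoulli distribution at test time, and the spread of predictions across samples serves as an uncertainty estimate. The architecture, dropout rate, and sample count are illustrative assumptions, not details from this work.

```python
# Minimal sketch of standard Monte Carlo Dropout (Gal & Ghahramani style);
# hyperparameters here are placeholders, not taken from the paper.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # fixed, input-independent mask distribution
    nn.Linear(64, 10),
)

def mc_dropout_predict(model, x, n_samples=20):
    """Average softmax predictions over independently sampled dropout masks."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    # Predictive mean and per-class standard deviation as an uncertainty proxy.
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(4, 16)
mean, std = mc_dropout_predict(model, x)
```

GFlowOut instead treats the mask as a latent variable and learns its posterior with a GFlowNet, rather than sampling each unit's mask independently from this fixed distribution.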