Approximate inference in deep Bayesian networks faces a dilemma: how to yield high-fidelity posterior approximations while maintaining computational efficiency and scalability. We tackle this challenge by introducing a novel variational structured approximation inspired by the Bayesian interpretation of Dropout regularization. Concretely, we focus on the inflexibility of the factorized structure in the Dropout posterior and propose an improved method called Variational Structured Dropout (VSD). VSD employs an orthogonal transformation to learn a structured representation of the variational noise and consequently induces statistical dependencies in the approximate posterior. Theoretically, VSD successfully addresses the pathologies of previous Variational Dropout methods and thus admits a standard Bayesian justification. We further show that VSD induces an adaptive regularization term with several desirable properties that contribute to better generalization. Finally, we conduct extensive experiments on standard benchmarks to demonstrate the effectiveness of VSD over state-of-the-art variational methods in predictive accuracy, uncertainty estimation, and out-of-distribution detection.
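To make the core mechanism concrete, below is a minimal sketch of the idea described above: multiplicative Gaussian dropout noise is passed through a learned orthogonal map, which induces correlations in the implied weight posterior. This is an illustration under stated assumptions, not the paper's implementation: it assumes a single Householder reflection as the orthogonal transformation, and the class name `VSDLinear` and its parameterization (`log_alpha`, `v`) are hypothetical.

```python
import torch
import torch.nn as nn

class VSDLinear(nn.Module):
    """Sketch of a VSD-style linear layer (illustrative, not the paper's code).

    Standard Variational Dropout uses factorized multiplicative noise
    w = theta * (1 + sqrt(alpha) * eps) with eps ~ N(0, I). Here the noise
    eps is first mixed by a Householder reflection U = I - 2 v v^T / ||v||^2,
    so the multiplicative noise, and hence the approximate posterior,
    carries statistical dependencies across input dimensions.
    """

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Per-dimension log noise variance (log alpha), as in Variational Dropout.
        self.log_alpha = nn.Parameter(torch.full((in_features,), -3.0))
        # Direction of the learned Householder reflection (assumed parameterization).
        self.v = nn.Parameter(torch.randn(in_features))

    def householder(self, x):
        # Apply U = I - 2 v v^T / ||v||^2 to the last dimension of x;
        # U is orthogonal, so it rotates the noise without changing its scale.
        v = self.v / self.v.norm()
        return x - 2.0 * (x @ v).unsqueeze(-1) * v

    def forward(self, x):
        # Factorized standard-normal noise, one draw per input unit ...
        eps = torch.randn_like(x)
        # ... made structured by the orthogonal map before being applied
        # multiplicatively to the inputs (a local-reparameterization-style draw).
        noise = 1.0 + torch.exp(0.5 * self.log_alpha) * self.householder(eps)
        return (x * noise) @ self.weight.t()
```

As a usage check, `VSDLinear(784, 256)(torch.randn(32, 784))` returns a `(32, 256)` tensor whose stochasticity is correlated across input dimensions; with `self.v` fixed at zero-effect (identity transform) the layer would reduce to ordinary factorized Gaussian dropout.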