Over-parameterized models, such as DeepNets and ConvNets, form a class of models that are routinely adopted in a wide variety of applications, and for which Bayesian inference is desirable but extremely challenging. Variational inference offers the tools to tackle this challenge in a scalable way and with some degree of flexibility on the approximation, but for over-parameterized models it suffers from the over-regularization property of the variational objective. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference (WHVI), which uses Walsh-Hadamard-based factorization strategies to reduce the parameterization and accelerate computations, thus avoiding over-regularization issues with the variational objective. Extensive theoretical and empirical analyses demonstrate that WHVI yields considerable speedups and model reductions compared to other techniques for approximate inference in over-parameterized models, and ultimately show how advances in kernel methods can be translated into advances in approximate Bayesian inference.
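As a concrete illustration of the factorization the abstract refers to, the sketch below implements a fast Walsh-Hadamard transform and a WHVI-style structured matrix-vector product W u, with W = S1 H diag(g) H S2, where H is the (normalized) Walsh-Hadamard matrix, S1 and S2 are learned diagonal scaling matrices, and g is a reparameterized sample from a Gaussian variational posterior q(g). This is a minimal sketch of the idea, not the paper's reference implementation: the helper names (fwht, whvi_matvec) and the choice of normalization for H are illustrative assumptions.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard Transform of a length-2^k vector, O(D log D)."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

def whvi_matvec(u, s1, s2, g):
    """Compute W @ u with W = S1 H diag(g) H S2 without materializing W.

    s1, s2: diagonals of the scaling matrices S1, S2.
    g: a reparameterized sample from the variational posterior q(g).
    Cost is O(D log D) time and O(D) parameters instead of O(D^2).
    """
    D = len(u)
    v = s2 * u                   # apply S2
    v = fwht(v) / np.sqrt(D)     # apply H (scaled so H is orthogonal)
    v = g * v                    # apply diag(g)
    v = fwht(v) / np.sqrt(D)     # apply H again
    return s1 * v                # apply S1

# Minimal usage: one stochastic forward pass through such a layer.
D = 8                                      # must be a power of two
rng = np.random.default_rng(0)
s1 = rng.standard_normal(D)
s2 = rng.standard_normal(D)
mu, sigma = rng.standard_normal(D), 0.1 * np.ones(D)
g = mu + sigma * rng.standard_normal(D)    # g ~ N(mu, diag(sigma^2))
u = rng.standard_normal(D)
print(whvi_matvec(u, s1, s2, g))
```

The design rationale, as described in the abstract, is that replacing a D x D weight matrix by this structured form shrinks the number of variational parameters from O(D^2) to O(D), which both accelerates computation (via the fast transform) and reduces the regularization pressure exerted by the variational objective.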