Gaussian variational approximation is a popular method for approximating posterior distributions in Bayesian inference, especially in high-dimensional and large-data settings. To control the computational cost while still capturing correlations among the variables, previous literature has adopted a low-rank plus diagonal structure for the Gaussian covariance matrix. For a specific Bayesian learning task, uniqueness of the solution is usually ensured by imposing stringent constraints on the parameterized covariance matrix, and these constraints can break down during the optimization process. In this paper, we consider two special covariance structures, obtained by imposing Stiefel manifold and Grassmann manifold constraints, to address the optimization difficulties in such factorization architectures. To speed up the updates with minimal hyperparameter-tuning effort, we design two new schemes of Riemannian stochastic gradient descent and compare them with existing methods for optimization on manifolds. Beyond resolving the identifiability issue, results from both simulated and empirical experiments show that the proposed methods attain competitive accuracy and comparable convergence speed in high-dimensional and large-scale learning tasks.
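As a concrete illustration of the manifold constraint mentioned above, the following is a minimal sketch (not the authors' proposed schemes) of one Riemannian SGD step for the low-rank factor B of a covariance Sigma = B B^T + D^2 when B is restricted to the Stiefel manifold (B^T B = I). The function name stiefel_sgd_step, the gradient euc_grad, and the step size lr are illustrative assumptions; the tangent-space projection and QR retraction used here are standard textbook choices, not necessarily those of the paper.

```python
# Minimal sketch, assuming the embedded (Euclidean) metric on the
# Stiefel manifold St(d, r) and a QR-based retraction. This is an
# illustrative example, not the authors' two proposed schemes.
import numpy as np

def stiefel_sgd_step(B, euc_grad, lr):
    """One Riemannian SGD step for B on the Stiefel manifold.

    B        : (d, r) matrix with orthonormal columns (B.T @ B = I_r)
    euc_grad : (d, r) Euclidean gradient of the objective at B (hypothetical)
    lr       : step size (hypothetical)
    """
    # Project the Euclidean gradient onto the tangent space at B:
    # grad = G - B sym(B^T G), where sym(A) = (A + A^T) / 2.
    BtG = B.T @ euc_grad
    riem_grad = euc_grad - B @ ((BtG + BtG.T) / 2.0)

    # Move in the tangent space, then retract back onto the manifold
    # via the (reduced) QR decomposition.
    Y = B - lr * riem_grad
    Q, R = np.linalg.qr(Y)
    # Fix column signs so the retraction is well defined; flipping
    # signs preserves orthonormality of the columns.
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q

# Tiny usage check: the updated factor stays on the Stiefel manifold.
rng = np.random.default_rng(0)
d, r = 10, 3
B0, _ = np.linalg.qr(rng.standard_normal((d, r)))
B1 = stiefel_sgd_step(B0, rng.standard_normal((d, r)), lr=0.1)
assert np.allclose(B1.T @ B1, np.eye(r), atol=1e-8)
```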