It is common practice to use Laplace approximations to compute marginal likelihoods in Bayesian versions of generalised linear models (GLMs). The marginal likelihoods, combined with model priors, are then used within various search algorithms to compute posterior marginal probabilities of models and of individual covariates, which enables Bayesian model selection and model averaging. For large sample sizes, even the Laplace approximation becomes computationally demanding, because the optimisation routine it relies on must evaluate the likelihood on the full dataset over multiple iterations; as a consequence, the algorithm does not scale to large datasets. To address this problem, we suggest a version of the popular batch stochastic gradient descent (BSGD) algorithm that estimates the marginal likelihood of a GLM by subsampling from the data. We further combine this algorithm with Markov chain Monte Carlo (MCMC) based methods for Bayesian model selection and provide theoretical results on the convergence of the estimates. Finally, we report experimental results illustrating the performance of the proposed algorithm.
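For context, the Laplace approximation referred to above takes its standard form: writing $\hat{\theta}$ for the posterior mode of a model $M$ with $d$ parameters and $H(\hat{\theta})$ for the negative Hessian of the log joint density $\log\{p(y \mid \theta, M)\,p(\theta \mid M)\}$ at the mode, the marginal likelihood is approximated by

\[
p(y \mid M) \;\approx\; (2\pi)^{d/2}\,\bigl|H(\hat{\theta})\bigr|^{-1/2}\,p(y \mid \hat{\theta}, M)\,p(\hat{\theta} \mid M).
\]

Locating $\hat{\theta}$ is the step that requires repeated full-data likelihood evaluations, and it is the step the subsampling targets. The Python sketch below illustrates the general idea for a Bayesian logistic regression; it is not the authors' BSGD variant, and the prior variance tau2, the minibatch schedule, the step size, and all function names are illustrative assumptions. The mode is found by minibatch gradient ascent with gradients rescaled to be unbiased for the full-data gradient, after which the Laplace approximation itself needs only a single full-data pass for the Hessian.

    import numpy as np

    def log_joint(beta, X, y, tau2=10.0):
        # Unnormalised log joint density for logistic regression with
        # independent N(0, tau2) priors on the coefficients (illustrative choice).
        eta = X @ beta
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
        logprior = -0.5 * np.sum(beta ** 2) / tau2 \
                   - 0.5 * len(beta) * np.log(2.0 * np.pi * tau2)
        return loglik + logprior

    def subsampled_mode(X, y, tau2=10.0, batch=256, lr=0.05, epochs=20, seed=0):
        # Locate the posterior mode by minibatch gradient ascent; each minibatch
        # gradient is rescaled by n/len(b) so it is unbiased for the full gradient.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        beta = np.zeros(d)
        for _ in range(epochs):
            for b in np.array_split(rng.permutation(n), max(1, n // batch)):
                p = 1.0 / (1.0 + np.exp(-(X[b] @ beta)))
                grad = (n / len(b)) * (X[b].T @ (y[b] - p)) - beta / tau2
                beta += lr * grad / n  # per-observation step size
        return beta

    def laplace_log_marginal(X, y, tau2=10.0):
        # Laplace approximation to log p(y): subsampled optimisation, then one
        # full-data evaluation of the negative Hessian at the mode.
        beta_hat = subsampled_mode(X, y, tau2=tau2)
        d = len(beta_hat)
        p = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
        H = X.T @ (X * (p * (1.0 - p))[:, None]) + np.eye(d) / tau2
        _, logdet = np.linalg.slogdet(H)
        return (log_joint(beta_hat, X, y, tau2)
                + 0.5 * d * np.log(2.0 * np.pi) - 0.5 * logdet)

In a model-selection loop, laplace_log_marginal would be called once per visited model (i.e. per column subset of X) inside the MCMC search, which is where the savings from the subsampled optimisation accrue.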