Variational Bayes (VB) applied to latent Dirichlet allocation (LDA) has become the most popular algorithm for aspect modeling. While sufficiently successful in extracting text topics from large corpora, VB is less successful at identifying aspects in the presence of limited data. We present a novel variational message passing algorithm for LDA and compare it with the gold-standard VB and collapsed Gibbs sampling. In situations where marginalisation leads to non-conjugate messages, we use ideas from sampling to derive approximate update equations. In cases where conjugacy holds, Loopy Belief Update (LBU), also known as the Lauritzen-Spiegelhalter algorithm, is used. Our algorithm, ALBU (approximate LBU), has strong similarities with Variational Message Passing (VMP), the message passing variant of VB. To compare the performance of the algorithms in the presence of limited data, we use data sets consisting of tweets and newsgroups. Additionally, to perform more fine-grained evaluations and comparisons, we use simulations that enable comparison with the ground truth via the Kullback-Leibler divergence (KLD). Using coherence measures for the text corpora and the KLD for the simulations, we show that ALBU learns latent distributions more accurately than VB, especially for smaller data sets.
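The simulation-based evaluation mentioned above can be illustrated with a minimal sketch of the KLD computation between a learned topic-word distribution and its ground-truth counterpart. The `kld` helper below is hypothetical (not from the paper's code), assuming both distributions are discrete vectors over the same vocabulary:

```python
import numpy as np

def kld(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log(p_i / q_i)
    for discrete distributions; inputs are normalised defensively."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0  # terms with p_i = 0 contribute zero by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Example: compare a learned topic-word distribution against the
# ground-truth distribution used to generate the simulated corpus.
true_topic = np.array([0.5, 0.3, 0.2])
learned_topic = np.array([0.45, 0.35, 0.2])
divergence = kld(true_topic, learned_topic)
```

A lower KLD indicates that the learned latent distribution is closer to the ground truth; note that the KLD is asymmetric, so the direction (ground truth as `p`, learned estimate as `q`) should be kept consistent across algorithms being compared.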