Variational Bayes (VB) applied to latent Dirichlet allocation (LDA) has become the most popular algorithm for aspect modeling. While sufficiently successful for extracting text topics from large corpora, VB is less successful at identifying aspects when data are limited. We present a novel variational message passing algorithm applied to LDA and compare it with the gold-standard VB and collapsed Gibbs sampling. In situations where marginalisation leads to non-conjugate messages, we use ideas from sampling to derive approximate update equations. Where conjugacy holds, the Loopy Belief Update (LBU), also known as the Lauritzen-Spiegelhalter algorithm, is used. Our algorithm, ALBU (approximate LBU), has strong similarities with Variational Message Passing (VMP), the message passing variant of VB. To compare the performance of the algorithms in the presence of limited data, we use data sets consisting of tweets and newsgroup posts. Using coherence measures, we show that ALBU learns latent distributions more accurately than VB, especially for smaller data sets.