Latent Dirichlet Allocation (LDA) is a probabilistic model used to uncover latent topics in a corpus of documents. Inference is often performed using variational Bayes (VB) algorithms, which approximate the posterior distribution over the parameters by maximising a lower bound on the marginal likelihood of the data. Deriving the variational update equations for new models requires considerable manual effort; variational message passing (VMP) has emerged as a "black-box" tool to expedite the process of variational inference. Applying VMP in practice still presents subtle challenges, however: the existing literature does not contain the steps necessary to implement VMP for the standard smoothed LDA model, nor is available black-box probabilistic graphical modelling software able to perform the word-topic updates that LDA requires. In this paper, we therefore present a detailed derivation of the VMP update equations for LDA. We see this as a first step towards enabling other researchers to derive the VMP updates for similar graphical models.