Variational Bayes (VB) applied to latent Dirichlet allocation (LDA) has become the most popular algorithm for aspect modeling. While sufficiently successful in extracting text topics from large corpora, VB is less successful at identifying aspects in the presence of limited data. We present a novel variational message passing algorithm for LDA and compare it with the gold-standard VB and collapsed Gibbs sampling. In situations where marginalisation leads to non-conjugate messages, we use ideas from sampling to derive approximate update equations. In cases where conjugacy holds, Loopy Belief Update (LBU), also known as the Lauritzen-Spiegelhalter algorithm, is used. Our algorithm, ALBU (approximate LBU), has strong similarities with Variational Message Passing (VMP), the message passing variant of VB. To compare the performance of the algorithms in the presence of limited data, we use data sets consisting of tweets and newsgroups. Additionally, to perform more fine-grained evaluations and comparisons, we use simulations that enable comparison with the ground truth via the Kullback-Leibler divergence (KLD). Using coherence measures for the text corpora and the KLD for the simulations, we show that ALBU learns latent distributions more accurately than VB, especially for smaller data sets.
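The simulation-based evaluation mentioned above can be illustrated with a minimal sketch of the KLD computation between a learned topic-word distribution and its ground-truth counterpart. The `kld` helper below is hypothetical (not from the paper's code), assuming both distributions are discrete vectors over the same vocabulary:

```python
import numpy as np

def kld(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log(p_i / q_i)
    for discrete distributions; inputs are normalised defensively."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0  # terms with p_i = 0 contribute zero by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Example: compare a learned topic-word distribution against the
# ground-truth distribution used to generate the simulated corpus.
true_topic = np.array([0.5, 0.3, 0.2])
learned_topic = np.array([0.45, 0.35, 0.2])
divergence = kld(true_topic, learned_topic)
```

A lower KLD indicates that the learned latent distribution is closer to the ground truth; note that the KLD is asymmetric, so the direction (ground truth as `p`, learned estimate as `q`) should be kept consistent across algorithms being compared.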