Hierarchical Topic Presence Models - Zhuanzhi (专知) Papers

April 16, 2021

Hierarchical Topic Presence Models

Jason Wang, Robert E. Weiss

Topic models analyze text from a set of documents. Documents are modeled as a mixture of topics, with topics defined as probability distributions on words. Inferences of interest include the most probable topics and characterization of a topic by inspecting the topic's highest probability words. Motivated by a data set of web pages (documents) nested in web sites, we extend the Poisson factor analysis topic model to hierarchical topic presence models for analyzing text from documents nested in known groups. We incorporate an unknown binary topic presence parameter for each topic at the web site and/or the web page level to allow web sites and/or web pages to be sparse mixtures of topics and we propose logistic regression modeling of topic presence conditional on web site covariates. We introduce local topics into the Poisson factor analysis framework, where each web site has a local topic not found in other web sites. Two data augmentation methods, the Chinese table distribution and Pólya-Gamma augmentation, aid in constructing our sampler. We analyze text from web pages nested in United States local public health department web sites to abstract topical information and understand national patterns in topic presence.
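To make the model structure in the abstract concrete, the following is a minimal generative sketch, not the authors' implementation. It simulates word counts for web pages nested in web sites under a Poisson factor analysis with site-level binary topic presence drawn from a logistic regression on site covariates, plus one local topic per site. The function name `simulate_corpus`, the Dirichlet and gamma priors, the single covariate, and all hyperparameter values are assumptions chosen for illustration; page-level presence and the Gibbs sampler with its data augmentation steps are omitted.

```python
import numpy as np


def simulate_corpus(n_sites=5, pages_per_site=4, n_topics=3, vocab_size=20, seed=0):
    """Simulate page-level word counts with site-level topic presence.

    All priors and hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Global topics: each row is a probability distribution over the vocabulary.
    global_topics = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)

    # Hypothetical site covariates: an intercept plus one standard-normal feature.
    covariates = np.column_stack([np.ones(n_sites), rng.normal(size=n_sites)])
    coefs = rng.normal(size=(2, n_topics))

    # Logistic regression gives the probability that each topic is present in each site.
    presence_prob = 1.0 / (1.0 + np.exp(-(covariates @ coefs)))
    presence = rng.binomial(1, presence_prob)  # shape (n_sites, n_topics), entries 0 or 1

    counts = np.zeros((n_sites, pages_per_site, vocab_size), dtype=np.int64)
    for s in range(n_sites):
        # One local topic per site, used only by that site's pages (assumed always present).
        local_topic = rng.dirichlet(np.full(vocab_size, 0.5))
        topics = np.vstack([global_topics, local_topic])
        active = np.append(presence[s], 1)

        for p in range(pages_per_site):
            # Gamma-distributed page loadings, zeroed out for topics absent from the site.
            loadings = rng.gamma(shape=1.0, scale=5.0, size=n_topics + 1) * active
            # Poisson factor analysis: the count rate is a loading-weighted mix of topics.
            counts[s, p] = rng.poisson(loadings @ topics)

    return counts, presence, presence_prob


if __name__ == "__main__":
    counts, presence, presence_prob = simulate_corpus()
    print("counts shape:", counts.shape)
    print("site-level topic presence:\n", presence)
```

Multiplying the loadings by the presence indicators is what makes a site a sparse mixture of topics: a topic with presence 0 contributes nothing to the Poisson rate of any page in that site.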
