SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble - 专知论文


entity · feature selection · context · rank · similarity

October 17, 2019

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble


Jiaming Shen,Zeqiu Wu,Dongming Lei,Jingbo Shang,Xiang Ren,Jiawei Han

From arXiv; accepted at ECML PKDD 2017

Corpus-based set expansion (i.e., finding the "complete" set of entities belonging to the same semantic class, based on a given corpus and a tiny set of seeds) is a critical task in knowledge discovery. It may facilitate numerous downstream applications, such as information extraction, taxonomy induction, question answering, and web search. To discover new entities in an expanded set, previous approaches either perform a one-time entity ranking based on distributional similarity, or resort to iterative pattern-based bootstrapping. The core challenge for these methods is how to deal with noisy context features derived from free-text corpora, which may lead to entity intrusion and semantic drifting. In this study, we propose a novel framework, SetExpan, which tackles this problem with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding the entity set based on denoised context features. Experiments on three datasets show that SetExpan is robust and outperforms previous state-of-the-art methods in terms of mean average precision.
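The two techniques named in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not specify the scoring functions, so the toy entity-to-feature weights, the weighted Jaccard similarity, the random feature subsampling, and the reciprocal-rank aggregation below are all assumptions chosen for illustration, and every function and parameter name is hypothetical.

```python
import random
from collections import defaultdict


def select_features(weights, seeds, top_q):
    """Context feature selection (assumed scoring): keep the top_q features
    with the highest total weight over the seed entities."""
    score = defaultdict(float)
    for s in seeds:
        for feat, w in weights.get(s, {}).items():
            score[feat] += w
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(score, key=lambda f: (-score[f], f))[:top_q]


def weighted_jaccard(a, b, feats):
    """Weighted Jaccard similarity of two entities, restricted to feats."""
    num = sum(min(a.get(f, 0.0), b.get(f, 0.0)) for f in feats)
    den = sum(max(a.get(f, 0.0), b.get(f, 0.0)) for f in feats)
    return num / den if den else 0.0


def rank_ensemble_expand(weights, seeds, top_q=4, n_rounds=10,
                         sample_frac=0.75, seed=0):
    """Rank ensemble (assumed form): rank candidates under several random
    feature subsets and aggregate the rankings by summed reciprocal rank."""
    rng = random.Random(seed)
    feats = select_features(weights, seeds, top_q)
    candidates = [e for e in weights if e not in seeds]
    k = min(len(feats), max(1, int(len(feats) * sample_frac)))
    agg = defaultdict(float)
    for _ in range(n_rounds):
        subset = rng.sample(feats, k)
        # Similarity of each candidate to the seed set: mean over seeds.
        sim = {
            e: sum(weighted_jaccard(weights[e], weights.get(s, {}), subset)
                   for s in seeds) / len(seeds)
            for e in candidates
        }
        ranked = sorted(candidates, key=lambda e: (-sim[e], e))
        for rank, e in enumerate(ranked, start=1):
            agg[e] += 1.0 / rank
    return sorted(candidates, key=lambda e: (-agg[e], e))


# Toy data (hypothetical): entity -> {context feature: weight}.
weights = {
    "Illinois": {"state of _": 3, "_ governor": 2, "visited _": 1},
    "Texas": {"state of _": 2, "_ governor": 3, "visited _": 1},
    "Ohio": {"state of _": 2, "_ governor": 2},
    "Chicago": {"city of _": 3, "visited _": 2},
    "Paris": {"city of _": 2, "visited _": 3},
}
seeds = ["Illinois", "Texas"]
expanded = rank_ensemble_expand(weights, seeds, top_q=2)
print(expanded)
```

With `top_q=2`, the generic feature `visited _`, which is shared by states and cities alike, is dropped before similarity is computed. In this toy setting that is what keeps the city entities from intruding into a set seeded with states.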

