Domain-Specific NER via Retrieving Correlated Samples - 专知论文 (Zhuanzhi Papers)


September 7, 2022

Domain-Specific NER via Retrieving Correlated Samples


Xin Zhang, Yong Jiang, Xiaobin Wang, Xuming Hu, Yueheng Sun, Pengjun Xie, Meishan Zhang

From arXiv; accepted by COLING 2022

Successful machine-learning-based Named Entity Recognition (NER) models can fail on texts from certain special domains, for instance Chinese addresses and e-commerce titles, which require adequate background knowledge. Such texts are also difficult for human annotators. In fact, we can obtain potentially helpful information from correlated texts, which share some entities with the input, to aid text understanding. One can then easily reason out the correct answer by referencing correlated samples. In this paper, we suggest enhancing NER models with correlated samples. We draw correlated samples with the sparse BM25 retriever from large-scale in-domain unlabeled data. To explicitly simulate the human reasoning process, we perform training-free entity type calibration by majority voting. To capture correlation features in the training stage, we suggest modeling correlated samples with a transformer-based multi-instance cross-encoder. Empirical results on datasets from the above two domains show the efficacy of our methods.
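To make the retrieval and voting steps concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes whitespace tokenization (real Chinese addresses or titles would need a proper tokenizer), the standard Okapi BM25 formula with commonly used defaults `k1=1.5` and `b=0.75`, and a simple voting rule in which each retrieved sample casts one vote for the type it assigns to an identical mention string. The names `BM25`, `top_k`, and `calibrate`, as well as the toy corpus and labels, are hypothetical; the abstract does not specify these details.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization, for illustration only.
    return text.lower().split()


class BM25:
    """Sparse BM25 retriever over a small in-memory corpus."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.docs = [tokenize(d) for d in corpus]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tfs = [Counter(d) for d in self.docs]
        n = len(self.docs)
        # Document frequency: number of documents containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query_tokens, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        total = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            norm = 1 - self.b + self.b * dl / self.avgdl
            total += self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + self.k1 * norm)
        return total

    def top_k(self, query, k=3):
        q = tokenize(query)
        scored = [(self.score(q, i), i) for i in range(len(self.docs))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))  # best score first
        return [i for _, i in scored[:k]]


def calibrate(entities, neighbor_entities):
    """Training-free type calibration by majority voting.

    entities: (mention, type) pairs predicted for the input text.
    neighbor_entities: one list of (mention, type) pairs per retrieved sample.
    """
    calibrated = []
    for mention, etype in entities:
        votes = Counter([etype])  # the original prediction also votes
        for ents in neighbor_entities:
            for m, t in ents:
                if m == mention:
                    votes[t] += 1
        best, best_count = votes.most_common(1)[0]
        if best_count > votes[etype]:  # keep the original type on ties
            etype = best
        calibrated.append((mention, etype))
    return calibrated


# Toy unlabeled in-domain corpus (hypothetical e-commerce titles).
corpus = [
    "apple iphone 13 phone case red",
    "apple iphone 13 screen protector",
    "fresh apple fruit box 5kg",
    "samsung galaxy phone case",
]
retriever = BM25(corpus)
neighbors = retriever.top_k("apple iphone 13 charger", k=2)

# Hypothetical base-model predictions for the input and its neighbors.
predicted = [("apple", "FRUIT")]
neighbor_preds = [[("apple", "BRAND")], [("apple", "BRAND"), ("iphone 13", "PRODUCT")]]
result = calibrate(predicted, neighbor_preds)
```

In this toy run, the two retrieved titles both label "apple" as a brand, so the vote overrides the base model's initial type. The multi-instance cross-encoder mentioned in the abstract is a trained component and is not sketched here.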

