Successful machine-learning-based Named Entity Recognition (NER) models can fail on texts from special domains, such as Chinese addresses and e-commerce titles, which require adequate background knowledge. Such texts are also difficult for human annotators. However, we can obtain potentially helpful information from correlated texts, which share common entities, to aid text understanding: one can then easily reason out the correct answer by referencing correlated samples. In this paper, we propose enhancing NER models with correlated samples. We retrieve correlated samples with the sparse BM25 retriever from large-scale in-domain unlabeled data. To explicitly simulate the human reasoning process, we perform training-free entity type calibration by majority voting. To capture correlation features in the training stage, we propose modeling correlated samples with a transformer-based multi-instance cross-encoder. Empirical results on datasets from the above two domains show the efficacy of our methods.
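The training-free calibration step can be sketched as a simple majority vote over the entity types predicted for the same mention in the retrieved correlated samples. This is a minimal sketch: the function name and interface are hypothetical, and tie-breaking follows `Counter` ordering rather than any rule specified by the paper.

```python
from collections import Counter

def calibrate_entity_type(predicted_type, correlated_predictions):
    """Training-free calibration (hypothetical interface): override the
    model's predicted type for an entity mention with the majority vote
    over the types predicted for the same mention in correlated samples
    retrieved by BM25."""
    votes = Counter(correlated_predictions)
    votes[predicted_type] += 1  # the original prediction also casts a vote
    majority_type, _ = votes.most_common(1)[0]
    return majority_type

# Three correlated samples voting "LOC" outweigh the model's own "ORG".
print(calibrate_entity_type("ORG", ["LOC", "LOC", "LOC"]))  # -> LOC
```

With no correlated samples retrieved, the vote degenerates to the model's original prediction, so the calibration never loses information.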