Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques can be used to adapt existing NER models to the target domain. But what should one do when there is no hand-labelled data for the target domain? This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision. The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain. These annotations are then merged together using a hidden Markov model which captures the varying accuracies and confusions of the labelling functions. A sequence labelling model can finally be trained on the basis of this unified annotation. We evaluate the approach on two English datasets (CoNLL 2003 and news articles from Reuters and Bloomberg) and demonstrate an improvement of about 7 percentage points in entity-level $F_1$ scores compared to an out-of-domain neural NER model.
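To make the pipeline concrete, below is a minimal sketch of the two key steps the abstract describes: labelling functions that independently annotate target-domain tokens, followed by an aggregation step that merges their outputs into one unified annotation. The sketch uses majority voting as a simplified stand-in for the paper's hidden Markov model (which additionally learns each function's accuracy and confusions); all names here are illustrative, not from the paper's codebase.

```python
# Minimal weak-supervision sketch: labelling functions + aggregation.
# Majority vote is used as a simplified stand-in for the paper's HMM,
# which models the varying accuracies and confusions of each function.
from collections import Counter

def lf_gazetteer(tokens):
    """Labelling function: tag tokens found in a small company gazetteer."""
    companies = {"Reuters", "Bloomberg"}  # illustrative gazetteer
    return ["B-ORG" if tok in companies else "O" for tok in tokens]

def lf_capitalised(tokens):
    """Labelling function: heuristically tag capitalised non-initial tokens."""
    return ["B-MISC" if i > 0 and tok[0].isupper() else "O"
            for i, tok in enumerate(tokens)]

def aggregate(tokens, labelling_functions):
    """Merge the outputs of all labelling functions by majority vote.

    The paper instead fits an HMM over the sequence of (hidden) true
    labels, treating each function's output as a noisy observation.
    """
    votes_per_token = zip(*(lf(tokens) for lf in labelling_functions))
    return [Counter(votes).most_common(1)[0][0] for votes in votes_per_token]

tokens = ["Reuters", "reported", "that", "Bloomberg", "hired", "analysts", "."]
unified = aggregate(tokens, [lf_gazetteer, lf_capitalised])
print(list(zip(tokens, unified)))
# The unified annotation can then serve as training data for a standard
# sequence labelling model on the target domain, as the abstract outlines.
```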