We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation by examining recall on specific subsets of "tough" mentions: unseen mentions (those whose tokens or token/type combination were not observed in training) and type-confusable mentions (token sequences that appear with multiple entity types in the test data). We demonstrate the usefulness of these metrics by evaluating English, Spanish, and Dutch corpora with five recent neural architectures. We identify subtle differences between the performance of BERT and Flair on two English NER corpora and identify a weak spot in the performance of current models on Spanish. We conclude that the TMR metrics enable differentiation between otherwise similar-scoring systems and identification of patterns in performance that would go unnoticed from overall precision, recall, and F1 alone.
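To make the metric definitions concrete, here is a minimal sketch (not the authors' implementation) of computing recall on the two tough subsets. It assumes mentions are represented simply as (token-tuple, type) pairs, ignoring corpus positions, and uses the token-level notion of "unseen"; the function name and data layout are hypothetical.

```python
def tough_mention_recall(train_mentions, gold, predicted):
    """Sketch of Tough Mentions Recall: recall restricted to 'tough' subsets.

    train_mentions: set of (tokens, type) pairs observed in training.
    gold, predicted: sets of (tokens, type) pairs from the test data,
    e.g. (("New", "York"), "LOC").
    """
    # Unseen mentions: token sequence never observed in training.
    seen_tokens = {tokens for tokens, _ in train_mentions}
    unseen = {m for m in gold if m[0] not in seen_tokens}

    # Type-confusable mentions: token sequences occurring with more
    # than one entity type in the test data.
    types_by_tokens = {}
    for tokens, etype in gold:
        types_by_tokens.setdefault(tokens, set()).add(etype)
    confusable = {m for m in gold if len(types_by_tokens[m[0]]) > 1}

    def recall(subset):
        # Fraction of the tough subset that the system recovered.
        return len(subset & predicted) / len(subset) if subset else None

    return {"unseen": recall(unseen), "type_confusable": recall(confusable)}
```

For example, if "Jordan" appears in the test data as both a PER and a LOC mention but only the PER reading is predicted, both tough recalls come out at 0.5 while overall scores could still look healthy.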