RAW-C: Relatedness of Ambiguous Words--in Context (A New Lexical Resource for English) - Zhuanzhi Papers

Cosine · Underestimation · Similarity · Overestimation · Word Sense Disambiguation

May 27, 2021

RAW-C: Relatedness of Ambiguous Words--in Context (A New Lexical Resource for English)

Sean Trott, Benjamin Bergen

From arXiv; ACL-IJCNLP 2021 camera-ready

Most words are ambiguous--i.e., they convey distinct meanings in different contexts--and even the meanings of unambiguous words are context-dependent. Both phenomena present a challenge for NLP. Recently, the advent of contextualized word embeddings has led to success on tasks involving lexical ambiguity, such as Word Sense Disambiguation. However, there are few tasks that directly evaluate how well these contextualized embeddings accommodate the more continuous, dynamic nature of word meaning--particularly in a way that matches human intuitions. We introduce RAW-C, a dataset of graded, human relatedness judgments for 112 ambiguous words in context (with 672 sentence pairs total), as well as human estimates of sense dominance. The average inter-annotator agreement (assessed using a leave-one-annotator-out method) was 0.79. We then show that a measure of cosine distance, computed using contextualized embeddings from BERT and ELMo, correlates with human judgments, but that cosine distance also systematically underestimates how similar humans find uses of the same sense of a word to be, and systematically overestimates how similar humans find uses of different-sense homonyms. Finally, we propose a synthesis between psycholinguistic theories of the mental lexicon and computational models of lexical semantics.
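The two quantities the abstract mentions, cosine distance between embeddings and leave-one-annotator-out agreement, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the vectors are toy values rather than BERT or ELMo embeddings, and the use of Pearson correlation for agreement is an assumption, since the abstract does not say which correlation statistic was used.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed statistic)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def leave_one_out_agreement(ratings):
    """Average correlation between each annotator and the mean of the others.

    `ratings` is a list with one inner list per annotator; each inner list
    holds that annotator's rating for every item, in the same item order.
    """
    scores = []
    for i, held_out in enumerate(ratings):
        others = [r for j, r in enumerate(ratings) if j != i]
        # Per-item mean rating of all annotators except the held-out one.
        others_mean = [mean(col) for col in zip(*others)]
        scores.append(pearson(held_out, others_mean))
    return mean(scores)


if __name__ == "__main__":
    # Toy stand-ins for two contextualized embeddings of the same target word.
    emb_a = [0.2, 0.7, 0.1]
    emb_b = [0.3, 0.6, 0.2]
    print(cosine_distance(emb_a, emb_b))

    # Made-up relatedness ratings from three annotators over five items.
    toy_ratings = [[0, 1, 2, 3, 4], [0, 2, 2, 3, 4], [1, 1, 2, 4, 4]]
    print(leave_one_out_agreement(toy_ratings))
```

In a real replication, the toy vectors would be replaced by the target word's embedding extracted from each sentence of a pair, and the resulting distances would be compared against the mean human relatedness rating for that pair.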

