BERT 学习作为人类的常识吗? 通过Lexica理解语言风格 (Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica) - 专知论文

会员服务 ·

0

可理解性 · BERT · 学成 · 标注 · 数据集 ·

2021 年 11 月 12 日

Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica

翻译：BERT 学习作为人类的常识吗? 通过Lexica理解语言风格

Shirley Anugrah Hayati,Dongyeop Kang,Lyle Ungar

from arxiv, Accepted at EMNLP 2021 Main Conference, updated typos and Appendix

People convey their intention and attitude through linguistic styles of the text that they write. In this study, we investigate lexicon usages across styles throughout two lenses: human perception and machine word importance, since words differ in the strength of the stylistic cues that they provide. To collect labels of human perception, we curate a new dataset, Hummingbird, on top of benchmarking style datasets. We have crowd workers highlight the representative words in the text that makes them think the text has the following styles: politeness, sentiment, offensiveness, and five emotion types. We then compare these human word labels with word importance derived from a popular fine-tuned style classifier like BERT. Our results show that the BERT often finds content words not relevant to the target style as important words used in style prediction, but humans do not perceive the same way even though for some styles (e.g., positive sentiment and joy) human- and machine-identified words share significant overlap for some styles.

翻译：人们通过他们所写的文字的语言风格表达其意图和态度。在这项研究中,我们通过两个镜头来调查不同风格的词汇用法:人类感知和机器词的重要性,因为文字在它们提供的文体提示的强度上有所不同。为了收集人类感知的标签,我们在基准样式数据集的顶端翻译了一个新的数据集Humingbird。我们有人群工人在文本中突出有代表性的词句,使他们认为文字有以下风格:礼貌、情绪、冒犯和五个情感类型。然后我们将这些人类字名标签与流行的微调风格分类师(如BERT)的词重要性进行比较。我们的结果显示,BERT常常发现与目标风格的文字无关的内容词与在风格预测中使用的重要词句不同,但人类甚至对某些风格(如正面情绪和喜悦)的人和机器识别的词与某些风格有重大重叠。

0

相关内容

可理解性

15%接受率！AAAI2022结果出炉，1349篇上榜，你的paper中了吗？

15%接受率！AAAI2022结果出炉，1349篇上榜，你的paper中了吗？

专知会员服务

37+阅读 · 2021年12月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【ACL2020-Google】BLEURT:一种基于迁移学习的自然语言生成度量

【ACL2020-Google】BLEURT:一种基于迁移学习的自然语言生成度量

专知会员服务

20+阅读 · 2020年5月12日

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

专知会员服务

40+阅读 · 2020年5月4日

【ACL2020-Allen AI】预训练语言模型中的无监督域聚类

【ACL2020-Allen AI】预训练语言模型中的无监督域聚类

专知会员服务

24+阅读 · 2020年4月7日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

专知会员服务

20+阅读 · 2019年11月22日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

Probing Linguistic Information For Logical Inference In Pre-trained Language Models

Arxiv

5+阅读 · 2021年12月3日

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

Arxiv

3+阅读 · 2021年7月1日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Semantics-aware BERT for Language Understanding

Arxiv

4+阅读 · 2019年9月5日

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning

Arxiv

3+阅读 · 2019年9月4日

Language Models as Knowledge Bases?

Arxiv

6+阅读 · 2019年9月4日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

BERT for Joint Intent Classification and Slot Filling

Arxiv

13+阅读 · 2019年2月28日

Passage Re-ranking with BERT

Arxiv

4+阅读 · 2019年2月18日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

VIP会员

文章信息

相关主题

相关VIP内容

15%接受率！AAAI2022结果出炉，1349篇上榜，你的paper中了吗？

15%接受率！AAAI2022结果出炉，1349篇上榜，你的paper中了吗？

专知会员服务

37+阅读 · 2021年12月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【ACL2020-Google】BLEURT:一种基于迁移学习的自然语言生成度量

【ACL2020-Google】BLEURT:一种基于迁移学习的自然语言生成度量

专知会员服务

20+阅读 · 2020年5月12日

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

专知会员服务

40+阅读 · 2020年5月4日

【ACL2020-Allen AI】预训练语言模型中的无监督域聚类

【ACL2020-Allen AI】预训练语言模型中的无监督域聚类

专知会员服务

24+阅读 · 2020年4月7日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

【北京智源大会2019】增强人类智能：从搜索引擎到智能任务助理（ Augmenting Human Intelligence: From Search Engines to Intelligent Task Assistants ）

专知会员服务

20+阅读 · 2019年11月22日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

美军“泰坦（TITAN）地面站目标系统”：是颠覆还是一场可预见的军事进步？

美空军指挥参谋学院 · 联合空中作战规划课程介绍（2025年） | 22页

一种基于视觉算法生成三维场景重建的多任务系统 | 2025最新200页

北约第十七届（2025年）网络冲突国际会议论文集 | 272页

相关资讯

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

相关论文

Probing Linguistic Information For Logical Inference In Pre-trained Language Models

Arxiv

5+阅读 · 2021年12月3日

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

Arxiv

3+阅读 · 2021年7月1日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Semantics-aware BERT for Language Understanding

Arxiv

4+阅读 · 2019年9月5日

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning

Arxiv

3+阅读 · 2019年9月4日

Language Models as Knowledge Bases?

Arxiv

6+阅读 · 2019年9月4日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

BERT for Joint Intent Classification and Slot Filling

Arxiv

13+阅读 · 2019年2月28日

Passage Re-ranking with BERT

Arxiv

4+阅读 · 2019年2月18日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

微信扫码咨询专知VIP会员