SafeWebUH在SemEval-2023任务11中的表现：学习侮辱性文本中的注释者不一致性：直接训练与聚合比较 (SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation) - 专知论文

会员服务 ·

0

semeval · 注释（编程） · 不一致性 · 交叉熵 · 一致 ·

2023 年 5 月 1 日

SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation

翻译：SafeWebUH在SemEval-2023任务11中的表现：学习侮辱性文本中的注释者不一致性：直接训练与聚合比较

Sadat Shahriar,Thamar Solorio

from arxiv, SemEval Task 11 paper (System)

Subjectivity and difference of opinion are key social phenomena, and it is crucial to take these into account in the annotation and detection process of derogatory textual content. In this paper, we use four datasets provided by SemEval-2023 Task 11 and fine-tune a BERT model to capture the disagreement in the annotation. We find individual annotator modeling and aggregation lowers the Cross-Entropy score by an average of 0.21, compared to the direct training on the soft labels. Our findings further demonstrate that annotator metadata contributes to the average 0.029 reduction in the Cross-Entropy score.

翻译：主观性和不同意见是关键的社会现象，在侮辱性文本内容的注释和检测过程中考虑这些因素至关重要。在本文中，我们使用SemEval-2023任务11提供的四个数据集，并微调BERT模型来捕捉注释中的不一致性。我们发现单个注释者建模和聚合可以将交叉熵分数平均降低0.21，而直接训练软标签则更低。我们的发现进一步证明，注释者元数据有助于平均交叉熵分数降低0.029。

0

相关内容

semeval

AAAI 2022 | 基于预训练-微调框架的图像差异描述任务

AAAI 2022 | 基于预训练-微调框架的图像差异描述任务

专知会员服务

18+阅读 · 2022年2月26日

【AAAI2022】上下文感知的词语替换与文本溯源

【AAAI2022】上下文感知的词语替换与文本溯源

专知会员服务

18+阅读 · 2022年1月23日

【ICML2020】统一预训练伪掩码语言模型

【ICML2020】统一预训练伪掩码语言模型

专知会员服务

27+阅读 · 2020年7月23日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

使用BERT做文本摘要

使用BERT做文本摘要

专知

23+阅读 · 2019年12月7日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

短文本情感分析关键技术研究

国家自然科学基金

9+阅读 · 2015年12月31日

AG-WUS-PcG-lncRNA互作对梅多雌蕊发育的调控

国家自然科学基金

0+阅读 · 2015年12月31日

注意缺陷多动障碍者的网络成瘾：认知缺陷和动机风格易感因素及追踪研究

国家自然科学基金

0+阅读 · 2013年12月31日

面向多类图像分类的众包主动学习方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

Orexin A在下丘脑-海马通路介导吗啡成瘾中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于Realized GARCH框架的波动率和相关性模型理论和应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

谷氨酸棒杆菌中3-羟基苯甲酸/龙胆酸代谢途径的调控研究

国家自然科学基金

0+阅读 · 2012年12月31日

内隐类别学习的认知神经机制

国家自然科学基金

0+阅读 · 2012年12月31日

HIC1调控CIITA转录机制研究及其在B细胞分化中的意义

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR激活对吗啡耐受的调控及其分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

Multi-View Class Incremental Learning

Arxiv

0+阅读 · 2023年6月16日

Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca

Arxiv

0+阅读 · 2023年6月15日

Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank

Arxiv

0+阅读 · 2023年6月15日

A Proxy-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks

Arxiv

0+阅读 · 2023年6月14日

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

Arxiv

0+阅读 · 2023年6月14日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Arxiv

10+阅读 · 2020年10月6日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Arxiv

15+阅读 · 2020年2月28日

VIP会员

文章信息

相关主题

注释（编程）

相关VIP内容

AAAI 2022 | 基于预训练-微调框架的图像差异描述任务

AAAI 2022 | 基于预训练-微调框架的图像差异描述任务

专知会员服务

18+阅读 · 2022年2月26日

【AAAI2022】上下文感知的词语替换与文本溯源

【AAAI2022】上下文感知的词语替换与文本溯源

专知会员服务

18+阅读 · 2022年1月23日

【ICML2020】统一预训练伪掩码语言模型

【ICML2020】统一预训练伪掩码语言模型

专知会员服务

27+阅读 · 2020年7月23日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《太空边缘（临近空间）的武器化？军事高空平台的进展与前景》

《利用星基增强系统（SBAS）信号进行射频干扰（RFI）检测与特征分析》

美陆军在“艾布拉姆斯”坦克与“布拉德利”步战车上测试“牛蛙”反无人机炮塔

《军事领域特性及其对军事人工智能应用的影响》

相关资讯

使用BERT做文本摘要

使用BERT做文本摘要

专知

23+阅读 · 2019年12月7日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】图像分类必读开创性论文汇总

【推荐】图像分类必读开创性论文汇总

机器学习研究会

14+阅读 · 2017年8月15日

相关论文

Multi-View Class Incremental Learning

Arxiv

0+阅读 · 2023年6月16日

Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca

Arxiv

0+阅读 · 2023年6月15日

Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank

Arxiv

0+阅读 · 2023年6月15日

A Proxy-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks

Arxiv

0+阅读 · 2023年6月14日

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

Arxiv

0+阅读 · 2023年6月14日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Arxiv

10+阅读 · 2020年10月6日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Arxiv

15+阅读 · 2020年2月28日

相关基金

短文本情感分析关键技术研究

国家自然科学基金

9+阅读 · 2015年12月31日

AG-WUS-PcG-lncRNA互作对梅多雌蕊发育的调控

国家自然科学基金

0+阅读 · 2015年12月31日

注意缺陷多动障碍者的网络成瘾：认知缺陷和动机风格易感因素及追踪研究

国家自然科学基金

0+阅读 · 2013年12月31日

面向多类图像分类的众包主动学习方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

Orexin A在下丘脑-海马通路介导吗啡成瘾中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于Realized GARCH框架的波动率和相关性模型理论和应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

谷氨酸棒杆菌中3-羟基苯甲酸/龙胆酸代谢途径的调控研究

国家自然科学基金

0+阅读 · 2012年12月31日

内隐类别学习的认知神经机制

国家自然科学基金

0+阅读 · 2012年12月31日

HIC1调控CIITA转录机制研究及其在B细胞分化中的意义

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR激活对吗啡耐受的调控及其分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员