Crowdsourcing is widely used to create data for common natural language understanding tasks. Despite the importance of these datasets for measuring and refining model understanding of language, there has been little focus on the crowdsourcing methods used for collecting the datasets. In this paper, we compare the efficacy of interventions that have been proposed in prior work as ways of improving data quality. We use multiple-choice question answering as a testbed and run a randomized trial by assigning crowdworkers to write questions under one of four different data collection protocols. We find that asking workers to write explanations for their examples is an ineffective stand-alone strategy for boosting NLU example difficulty. However, we find that training crowdworkers, and then using an iterative process of collecting data, sending feedback, and qualifying workers based on expert judgments, is an effective means of collecting challenging data. Using crowdsourced, instead of expert, judgments to qualify workers and send feedback does not prove to be effective. We observe that the data from the iterative protocol with expert assessments is more challenging by several measures. Notably, the human--model gap on the unanimous agreement portion of this data is, on average, twice as large as the gap for data collected under the baseline protocol.
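As a rough illustration of the reported metric, the sketch below shows one way a human--model accuracy gap could be computed on the unanimous-agreement portion of a dataset. This is a minimal sketch, not the authors' evaluation code; all field names ("annotator_labels", "model_pred", "gold") are hypothetical placeholders for whatever format the collected examples use.

```python
# Minimal sketch (assumed data format) of a human--model gap on the
# unanimous-agreement subset: human accuracy minus model accuracy,
# restricted to examples where all annotators chose the same answer.
from typing import Dict, List


def accuracy(preds: List[str], golds: List[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


def human_model_gap(examples: List[Dict]) -> float:
    """Human accuracy minus model accuracy on unanimously-labeled examples."""
    unanimous = [ex for ex in examples if len(set(ex["annotator_labels"])) == 1]
    human_preds = [ex["annotator_labels"][0] for ex in unanimous]
    model_preds = [ex["model_pred"] for ex in unanimous]
    golds = [ex["gold"] for ex in unanimous]
    return accuracy(human_preds, golds) - accuracy(model_preds, golds)


if __name__ == "__main__":
    toy = [
        {"annotator_labels": ["B", "B", "B"], "model_pred": "A", "gold": "B"},
        {"annotator_labels": ["C", "C", "C"], "model_pred": "C", "gold": "C"},
        # Not unanimous, so excluded from the gap computation:
        {"annotator_labels": ["A", "D", "A"], "model_pred": "A", "gold": "A"},
    ]
    print(f"human--model gap on unanimous subset: {human_model_gap(toy):.2f}")
```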