隐性培训非专家说明员的批注课程 (Annotation Curricula to Implicitly Train Non-Expert Annotators) - 专知论文

会员服务 ·

0

可辨认的 · 可约的 · INTERACT · 示例 · SimPLe ·

2021 年 6 月 9 日

Annotation Curricula to Implicitly Train Non-Expert Annotators

翻译：隐性培训非专家说明员的批注课程

Ji-Ung Lee,Jan-Christoph Klie,Iryna Gurevych

Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; especially in citizen science or crowd sourcing scenarios where domain expertise is not required and only annotation guidelines are provided. To alleviate these issues, we propose annotation curricula, a novel approach to implicitly train annotators. Our goal is to gradually introduce annotators into the task by ordering instances that are annotated according to a learning curriculum. To do so, we first formalize annotation curricula for sentence- and paragraph-level annotation tasks, define an ordering strategy, and identify well-performing heuristics and interactively trained models on three existing English datasets. We then conduct a user study with 40 voluntary participants who are asked to identify the most fitting misconception for English tweets about the Covid-19 pandemic. Our results show that using a simple heuristic to order instances can already significantly reduce the total annotation time while preserving a high annotation quality. Annotation curricula thus can provide a novel way to improve data collection. To facilitate future research, we further share our code and data consisting of 2,400 annotations.

翻译：为缓解这些问题,我们提议了批注课程,这是一种隐含培训说明员的新办法。我们的目标是通过根据学习课程订购附加说明的事例,逐步在任务中引入批注员。为了做到这一点,我们首先将判决和段落级批注任务的批注课程正规化,确定排序战略,并找出三个现有英国数据集的良好超常和互动培训模式。我们然后与40名自愿参与者进行用户研究,请他们查明有关Covid-19大流行的英语推文最恰当的错误。我们的结果显示,使用简单的超常来排序实例可以大大缩短批注时间,同时保持较高的批注质量。因此,批注课程可以提供改进数据收集的新颖的方法。

0

相关内容

可辨认的

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【2020Manning新书】人工智能成功之道，272页pdf，Succeeding with AI

【2020Manning新书】人工智能成功之道，272页pdf，Succeeding with AI

专知会员服务

100+阅读 · 2020年3月8日

【Manning2020新书】R/mlr机器学习，513页pdf，Machine Learning with R

【Manning2020新书】R/mlr机器学习，513页pdf，Machine Learning with R

专知会员服务

131+阅读 · 2020年3月7日

【2020新书】Python大数据处理，Mastering Large Datasets with Python，311页pdf

【2020新书】Python大数据处理，Mastering Large Datasets with Python，311页pdf

专知会员服务

197+阅读 · 2020年2月1日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

已删除

将门创投

8+阅读 · 2018年10月31日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

Changes in European Solidarity Before and During COVID-19: Evidence from a Large Crowd- and Expert-Annotated Twitter Dataset

Arxiv

0+阅读 · 2021年8月2日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

37+阅读 · 2021年8月2日

COfEE: A Comprehensive Ontology for Event Extraction from text, with an online annotation tool

Arxiv

0+阅读 · 2021年8月1日

On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals

Arxiv

0+阅读 · 2021年7月30日

Strategically using Applied Machine Learning for Accessibility Documentation in the Built Environment

Arxiv

0+阅读 · 2021年7月30日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data

Arxiv

12+阅读 · 2018年6月8日

Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction

Arxiv

9+阅读 · 2018年2月28日

Adversarial Learning for Chinese NER from Crowd Annotations

Arxiv

15+阅读 · 2018年1月16日

Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time

Arxiv

6+阅读 · 2018年1月10日

VIP会员

文章信息

相关主题

相关VIP内容

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【2020Manning新书】人工智能成功之道，272页pdf，Succeeding with AI

【2020Manning新书】人工智能成功之道，272页pdf，Succeeding with AI

专知会员服务

100+阅读 · 2020年3月8日

【Manning2020新书】R/mlr机器学习，513页pdf，Machine Learning with R

【Manning2020新书】R/mlr机器学习，513页pdf，Machine Learning with R

专知会员服务

131+阅读 · 2020年3月7日

【2020新书】Python大数据处理，Mastering Large Datasets with Python，311页pdf

【2020新书】Python大数据处理，Mastering Large Datasets with Python，311页pdf

专知会员服务

197+阅读 · 2020年2月1日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

Deep Research（深度研究）：系统性综述

《革新战术战场空间能力：反无人机系统》报告

【普林斯顿博士论文】用于语音的生成式通用模型

螺旋式开发作为战略资产：美军启示

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

已删除

将门创投

8+阅读 · 2018年10月31日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

相关论文

Changes in European Solidarity Before and During COVID-19: Evidence from a Large Crowd- and Expert-Annotated Twitter Dataset

Arxiv

0+阅读 · 2021年8月2日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

37+阅读 · 2021年8月2日

COfEE: A Comprehensive Ontology for Event Extraction from text, with an online annotation tool

Arxiv

0+阅读 · 2021年8月1日

On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals

Arxiv

0+阅读 · 2021年7月30日

Strategically using Applied Machine Learning for Accessibility Documentation in the Built Environment

Arxiv

0+阅读 · 2021年7月30日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data

Arxiv

12+阅读 · 2018年6月8日

Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction

Arxiv

9+阅读 · 2018年2月28日

Adversarial Learning for Chinese NER from Crowd Annotations

Arxiv

15+阅读 · 2018年1月16日

Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time

Arxiv

6+阅读 · 2018年1月10日

微信扫码咨询专知VIP会员