Crowdsourcing has emerged as an effective platform for labeling large volumes of data in a cost- and time-efficient manner. Most previous works have focused on designing efficient algorithms to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourced labeling with the goal of recovering not only the ground truth but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model in which each task has top-two plausible answers, distinguished from the rest of the choices. Task difficulty is quantified by the confusion probability between the top two, and worker reliability is quantified by the probability of giving an answer among the top two. Under this model, we propose a two-stage inference algorithm to infer the top-two answers as well as the confusion probability. We show that our algorithm achieves the minimax optimal convergence rate. We conduct both synthetic and real-data experiments and demonstrate that our algorithm outperforms other recent algorithms. We also show the applicability of our algorithm in inferring the difficulty of tasks and in training neural networks with soft labels composed of the top-two most plausible classes.
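To make the described model concrete, the following is a minimal sketch of one possible generative process consistent with the abstract: each task has a ground truth, a most confusing answer, and a confusion probability, while each worker has a reliability equal to the probability of answering within the top two. The parameter names (p_j, q_i) and the uniform-noise assumption outside the top two are our illustrative assumptions, not necessarily the paper's exact specification.

```python
import numpy as np

def simulate_responses(n_tasks=100, n_workers=30, n_choices=5, seed=0):
    """Sketch of a top-two generative model for multi-choice crowdsourcing.

    Assumptions (ours, for illustration): task i has ground truth g_i, a most
    confusing answer c_i != g_i, and a confusion probability q_i (task
    difficulty); worker j has reliability p_j, the probability of answering
    within the top two. Given a top-two answer, the worker reports g_i with
    probability 1 - q_i and c_i with probability q_i; otherwise the answer is
    uniform over the remaining choices.
    """
    rng = np.random.default_rng(seed)
    g = rng.integers(0, n_choices, size=n_tasks)                      # ground-truth labels
    c = (g + rng.integers(1, n_choices, size=n_tasks)) % n_choices    # most confusing answers
    q = rng.uniform(0.0, 0.5, size=n_tasks)                           # confusion probabilities
    p = rng.uniform(0.6, 1.0, size=n_workers)                         # worker reliabilities

    answers = np.empty((n_tasks, n_workers), dtype=int)
    for i in range(n_tasks):
        for j in range(n_workers):
            if rng.random() < p[j]:
                # answer falls within the top two
                answers[i, j] = c[i] if rng.random() < q[i] else g[i]
            else:
                # answer is uniform over the choices outside the top two
                others = [k for k in range(n_choices) if k not in (g[i], c[i])]
                answers[i, j] = rng.choice(others)
    return answers, g, c, q, p
```

Such a simulator is only meant to illustrate the roles of the confusion probability and worker reliability; it can serve as a starting point for synthetic experiments like those referenced in the abstract.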