Real-world data for classification is often labeled by multiple annotators. To analyze such data, we introduce CROWDLAB, a straightforward approach that utilizes any trained classifier to estimate: (1) a consensus label for each example that aggregates the available annotations; (2) a confidence score for how likely each consensus label is correct; (3) a rating for each annotator quantifying the overall correctness of their labels. Existing algorithms that estimate related quantities in crowdsourcing often rely on sophisticated generative models with iterative inference; CROWDLAB instead uses a simple weighted ensemble. These algorithms also tend to rely solely on annotator statistics, ignoring the features of the examples from which the annotations derive. CROWDLAB utilizes any classifier model trained on these features, and can thus generalize better between examples with similar features. On real-world multi-annotator image data, our proposed method provides better estimates of (1)-(3) than existing algorithms such as Dawid-Skene and GLAD.
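To make the weighted-ensemble idea concrete, below is a minimal NumPy sketch of how a classifier's predicted class probabilities could be combined with annotator labels to produce the three estimates described above. This is an illustrative toy, not the paper's exact CROWDLAB estimator: the function name, the fixed `classifier_weight` parameter, and the simple agreement-based annotator rating are all assumptions for exposition (the actual method derives per-annotator and per-example weights from estimated annotator quality).

```python
import numpy as np

def weighted_ensemble_consensus(annotations, pred_probs, classifier_weight=0.5):
    """Toy weighted-ensemble consensus (illustrative sketch, not the exact CROWDLAB algorithm).

    annotations : (n_examples, n_annotators) int array; -1 where an annotator
                  did not label the example.
    pred_probs  : (n_examples, n_classes) array of the trained classifier's
                  predicted class probabilities for each example.
    classifier_weight : hypothetical scalar trading off the classifier against
                  the annotators; the real method estimates such weights from data.
    """
    n_examples, n_classes = pred_probs.shape
    consensus = np.empty(n_examples, dtype=int)
    confidence = np.empty(n_examples)

    for i in range(n_examples):
        # Empirical label distribution from the annotations available for this example.
        given = annotations[i][annotations[i] >= 0]
        counts = np.bincount(given, minlength=n_classes).astype(float)
        annotator_probs = (counts / counts.sum()) if counts.sum() > 0 else np.full(n_classes, 1.0 / n_classes)

        # Weighted ensemble of classifier and annotator label distributions.
        ensemble = classifier_weight * pred_probs[i] + (1 - classifier_weight) * annotator_probs
        consensus[i] = ensemble.argmax()   # (1) consensus label
        confidence[i] = ensemble.max()     # (2) confidence that it is correct

    # (3) Crude annotator rating: fraction of an annotator's labels that agree with the consensus.
    n_annotators = annotations.shape[1]
    ratings = np.array([
        (annotations[:, j] == consensus)[annotations[:, j] >= 0].mean()
        if (annotations[:, j] >= 0).any() else np.nan
        for j in range(n_annotators)
    ])
    return consensus, confidence, ratings
```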