Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation - Zhuanzhi Papers


Image classification · Estimation/Estimator · Annotation · Analysis · Identifiable

July 13, 2022

Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation


Lars Schmarje, Vasco Grossmann, Claudius Zelenka, Sabine Dippel, Rainer Kiko, Mariusz Oszust, Matti Pastell, Jenny Stracke, Anna Valros, Nina Volkmann, Reinhard Koch

From arXiv. Data, supplementary material, and source code will be released soon.

High-quality data is necessary for modern machine learning. However, the acquisition of such data is difficult due to noisy and ambiguous annotations of humans. The aggregation of such annotations to determine the label of an image leads to a lower data quality. We propose a data-centric image classification benchmark with nine real-world datasets and multiple annotations per image to investigate and quantify the impact of such data quality issues. We focus on a data-centric perspective by asking how we could improve the data quality. Across thousands of experiments, we show that multiple annotations allow a better approximation of the real underlying class distribution. We identify that hard labels cannot capture the ambiguity of the data and this might lead to the common issue of overconfident models. Based on the presented datasets, benchmark baselines, and analysis, we create multiple research opportunities for the future.
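The contrast between hard labels and a class distribution estimated from several annotations can be illustrated with a minimal sketch. This is not the authors' code or benchmark pipeline; the function names `soft_label` and `hard_label`, the class names, and the votes are hypothetical and chosen only for illustration. It assumes a soft label is the relative frequency of each class among an image's annotations and a hard label is the majority vote.

```python
from collections import Counter


def soft_label(annotations, classes):
    """Relative frequency of each class among the annotations of one image."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[c] / total for c in classes]


def hard_label(annotations, classes):
    """Majority vote: the single most frequent class (ties go to the earlier class)."""
    dist = soft_label(annotations, classes)
    return classes[dist.index(max(dist))]


# Hypothetical example: five annotators disagree on one ambiguous image.
classes = ["cat", "dog"]
annotations = ["cat", "dog", "cat", "cat", "dog"]

soft = soft_label(annotations, classes)  # keeps the 3-to-2 split between classes
hard = hard_label(annotations, classes)  # collapses the split to one class

print(soft, hard)
```

In this toy case the hard label records only "cat", while the soft label retains the information that two of five annotators chose "dog", which is the kind of ambiguity the abstract says hard labels cannot capture.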

