Automatic image captioning has improved significantly over the last few years, but the problem is far from solved, with state-of-the-art models still often producing low-quality captions when used in the wild. In this paper, we focus on the task of Quality Estimation (QE) for image captions, which aims to model caption quality from a human perspective and without access to ground-truth references, so that it can be applied at prediction time to detect low-quality captions produced on previously unseen images. For this task, we develop a human evaluation process that collects coarse-grained caption annotations from crowdsourced users, which we then use to build a large-scale dataset spanning more than 600k caption quality ratings. We then carefully validate the quality of the collected ratings and establish baseline models for this new QE task. Finally, we further collect fine-grained caption quality annotations from trained raters, and use them to demonstrate that QE models trained over the coarse ratings can effectively detect and filter out low-quality image captions, thereby improving the user experience of captioning systems.
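To make the intended prediction-time usage concrete, the sketch below shows how a trained QE model could be applied to filter generated captions before they reach users. This is a minimal illustration, not the paper's implementation: the scoring function `qe_score` and the quality `threshold` are hypothetical stand-ins for a trained QE model and its tuned cutoff.

```python
# Minimal sketch (assumptions, not the paper's code) of QE-based caption filtering:
# a hypothetical trained QE model scores each (image, caption) pair without any
# ground-truth reference, and captions below an assumed quality threshold are dropped.

from typing import Callable, List, Tuple


def filter_captions(
    pairs: List[Tuple[str, str]],           # (image_id, generated caption)
    qe_score: Callable[[str, str], float],  # hypothetical QE model: returns quality in [0, 1]
    threshold: float = 0.5,                 # assumed quality cutoff, tuned on held-out ratings
) -> List[Tuple[str, str]]:
    """Keep only captions whose estimated quality clears the threshold."""
    return [(img, cap) for img, cap in pairs if qe_score(img, cap) >= threshold]


if __name__ == "__main__":
    # Dummy scorer for illustration only: a real QE model would be trained on
    # human quality ratings and consume image features rather than image ids.
    dummy_scorer = lambda img, cap: 0.9 if len(cap.split()) > 3 else 0.2

    candidates = [
        ("img_001", "a dog running on the beach"),
        ("img_002", "a photo"),
    ]
    print(filter_captions(candidates, dummy_scorer))
```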