图像定位系统自动测试 (Automated Testing of Image Captioning Systems) - 专知论文

会员服务 ·

0

图像字幕 · Automator · 标注 · Microsoft Azure · state-of-the-art ·

2022 年 6 月 14 日

Automated Testing of Image Captioning Systems

翻译：图像定位系统自动测试

Boxi Yu,Zhiqing Zhong,Xinran Qin,Jiayi Yao,Yuancheng Wang,Pinjia He

Image captioning (IC) systems, which automatically generate a text description of the salient objects in an image (real or synthetic), have seen great progress over the past few years due to the development of deep neural networks. IC plays an indispensable role in human society, for example, labeling massive photos for scientific studies and assisting visually-impaired people in perceiving the world. However, even the top-notch IC systems, such as Microsoft Azure Cognitive Services and IBM Image Caption Generator, may return incorrect results, leading to the omission of important objects, deep misunderstanding, and threats to personal safety. To address this problem, we propose MetaIC, the \textit{first} metamorphic testing approach to validate IC systems. Our core idea is that the object names should exhibit directional changes after object insertion. Specifically, MetaIC (1) extracts objects from existing images to construct an object corpus; (2) inserts an object into an image via novel object resizing and location tuning algorithms; and (3) reports image pairs whose captions do not exhibit differences in an expected way. In our evaluation, we use MetaIC to test one widely-adopted image captioning API and five state-of-the-art (SOTA) image captioning models. Using 1,000 seeds, MetaIC successfully reports 16,825 erroneous issues with high precision (84.9\%-98.4\%). There are three kinds of errors: misclassification, omission, and incorrect quantity. We visualize the errors reported by MetaIC, which shows that flexible overlapping setting facilitates IC testing by increasing and diversifying the reported errors. In addition, MetaIC can be further generalized to detect label errors in the training dataset, which has successfully detected 151 incorrect labels in MS COCO Caption, a standard dataset in image captioning.

翻译：图像标题 (IC) 系统自动生成图像( 真实的或合成的) 突出对象的文字描述, 过去几年里, 由于深层神经网络的发展, 该系统取得了巨大的进步。 IC 在人类社会中发挥了不可或缺的作用, 例如, 为科学研究贴上大片照片, 并帮助视视障者感知世界。但是, 即使顶尖的IC系统, 如微软 Azure 识别服务和 IBM 图像描述生成器等, 也可能会返回错误的结果, 导致重要对象的遗漏、深刻的误解和对人身安全的威胁。为了解决这个问题, 我们提议MetAIIC, 将 & textit{First 变异性测试方法用于验证 IC 系统。我们的核心思想是, 将对象名称标为在对象插入后显示方向变化变化。具体来说, MetAIIC (1) 从现有图像中提取对象来构建一个天体; (2) 通过新对象调整和位置校正算算法, 以及(3) 报告图像配对, 其标题不会在预期的方式出现差异。在评估中, 我们使用MetIC 284 显示高级数据版本显示版本 5 版本。

0

相关内容

图像字幕

图像字幕（Image Captioning）,是指从图像生成文本描述的过程，主要根据图像中物体和物体的动作。

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

注意缺陷多动障碍者的网络成瘾：认知缺陷和动机风格易感因素及追踪研究

国家自然科学基金

0+阅读 · 2013年12月31日

AMPK介导自噬在急性放射性皮炎防治中的作用与机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

HIPIMS制备环境自适应C/MoSx复合润滑薄膜的界面结构调控及摩擦机理

国家自然科学基金

0+阅读 · 2012年12月31日

基于语义分析的数据库交互技术

国家自然科学基金

0+阅读 · 2012年12月31日

基于云计算的建筑全生命期BIM集成与应用关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

SNF1/AMPK/SnRK1复合体的亚基UPS调控拟南芥花粉与柱头互作的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

基于Petri网灵巧信标的自动制造系统死锁控制策略研究

国家自然科学基金

0+阅读 · 2012年12月31日

miR-126对CD4+CD25+调节性T细胞外周诱导的作用研究

国家自然科学基金

0+阅读 · 2009年12月31日

Ramsey－CPT原子频标研制

国家自然科学基金

0+阅读 · 2009年12月31日

Content-oriented learned image compression

Content-oriented learned image compression

Arxiv

0+阅读 · 2022年8月1日

A Taxonomy of Prompt Modifiers for Text-To-Image Generation

Arxiv

0+阅读 · 2022年7月31日

ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2

Arxiv

0+阅读 · 2022年7月30日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

A survey on deep hashing for image retrieval

A survey on deep hashing for image retrieval

Arxiv

15+阅读 · 2020年6月10日

Exploring Visual Relationship for Image Captioning

Exploring Visual Relationship for Image Captioning

Arxiv

15+阅读 · 2018年9月19日

Image Captioning

Arxiv

11+阅读 · 2018年5月13日

Image Captioning using Deep Neural Architectures

Arxiv

20+阅读 · 2018年1月17日

DeepSeek: Content Based Image Search & Retrieval

Arxiv

13+阅读 · 2018年1月11日

VIP会员

文章信息

相关主题

Microsoft Azure

state-of-the-art

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

相关论文

Content-oriented learned image compression

Content-oriented learned image compression

Arxiv

0+阅读 · 2022年8月1日

A Taxonomy of Prompt Modifiers for Text-To-Image Generation

Arxiv

0+阅读 · 2022年7月31日

ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2

Arxiv

0+阅读 · 2022年7月30日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

A survey on deep hashing for image retrieval

A survey on deep hashing for image retrieval

Arxiv

15+阅读 · 2020年6月10日

Exploring Visual Relationship for Image Captioning

Exploring Visual Relationship for Image Captioning

Arxiv

15+阅读 · 2018年9月19日

Image Captioning

Arxiv

11+阅读 · 2018年5月13日

Image Captioning using Deep Neural Architectures

Arxiv

20+阅读 · 2018年1月17日

DeepSeek: Content Based Image Search & Retrieval

Arxiv

13+阅读 · 2018年1月11日

相关基金

注意缺陷多动障碍者的网络成瘾：认知缺陷和动机风格易感因素及追踪研究

国家自然科学基金

0+阅读 · 2013年12月31日

AMPK介导自噬在急性放射性皮炎防治中的作用与机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

HIPIMS制备环境自适应C/MoSx复合润滑薄膜的界面结构调控及摩擦机理

国家自然科学基金

0+阅读 · 2012年12月31日

基于语义分析的数据库交互技术

国家自然科学基金

0+阅读 · 2012年12月31日

基于云计算的建筑全生命期BIM集成与应用关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

SNF1/AMPK/SnRK1复合体的亚基UPS调控拟南芥花粉与柱头互作的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

基于Petri网灵巧信标的自动制造系统死锁控制策略研究

国家自然科学基金

0+阅读 · 2012年12月31日

miR-126对CD4+CD25+调节性T细胞外周诱导的作用研究

国家自然科学基金

0+阅读 · 2009年12月31日

Ramsey－CPT原子频标研制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员