通过多模式零售修改图像文本检索器 (Revising Image-Text Retrieval via Multi-Modal Entailment) - 专知论文

会员服务 ·

0

模型评估 · MoDELS · 数据集 · Performer · 标注 ·

2022 年 9 月 1 日

Revising Image-Text Retrieval via Multi-Modal Entailment

翻译：通过多模式零售修改图像文本检索器

Xu Yan,Chunhui Ai,Ziqiang Cao,Min Cao,Sujian Li,Wenjie Li,Guohong Fu

from arxiv, 10 pages

An outstanding image-text retrieval model depends on high-quality labeled data. While the builders of existing image-text retrieval datasets strive to ensure that the caption matches the linked image, they cannot prevent a caption from fitting other images. We observe that such a many-to-many matching phenomenon is quite common in the widely-used retrieval datasets, where one caption can describe up to 178 images. These large matching-lost data not only confuse the model in training but also weaken the evaluation accuracy. Inspired by visual and textual entailment tasks, we propose a multi-modal entailment classifier to determine whether a sentence is entailed by an image plus its linked captions. Subsequently, we revise the image-text retrieval datasets by adding these entailed captions as additional weak labels of an image and develop a universal variable learning rate strategy to teach a retrieval model to distinguish the entailed captions from other negative samples. In experiments, we manually annotate an entailment-corrected image-text retrieval dataset for evaluation. The results demonstrate that the proposed entailment classifier achieves about 78% accuracy and consistently improves the performance of image-text retrieval baselines.

翻译：尚未解决的图像文本检索模型取决于高质量的标签数据。虽然现有图像文本检索数据集的构建者努力确保标题与链接的图像匹配, 但他们不能阻止标题与其他图像相匹配。我们注意到, 在广泛使用的检索数据集中, 如此多到多的匹配现象非常常见, 一个标题可以描述到 178 个图像。这些巨大的匹配丢失数据不仅混淆了培训中的模型, 而且还削弱了评估的准确性。在视觉和文字要求的任务的启发下, 我们提议一个多式的隐含分类器, 以确定一个句子是否由图像及其链接的标题所产生。随后, 我们修改图像文本检索数据集, 将这些标题添加为图像的额外弱标签, 并开发一个通用的可变速率战略, 教给一个检索模型, 以区分所需的标题与其他负面的样本。在实验中, 我们手动一个包含修正的图像文本检索数据集, 用于评估。结果表明, 拟议的生成的分类实现了大约 78% 的准确性, 并不断改进图像文本检索基线的性能。

0

相关内容

模型评估

机器学习系统设计系统评估标准

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

专知

12+阅读 · 2018年3月24日

血清exosomes中蛋白、miRNAs、lncRNAs多指标联合用于血吸虫活虫感染诊断的可行性研究

国家自然科学基金

0+阅读 · 2016年12月31日

RanGTPase核质运输系统通过调控AIF核转移介导细胞凋亡的分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

去泛素化酶15调控Mcl1稳定性在黑色素瘤对维罗非尼耐药中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Cr3+:ABSi2O6(A=Na，K，Ca；B=Mg，Al)可调谐激光晶体的研制

国家自然科学基金

0+阅读 · 2014年12月31日

人工耳蜗植入者汉语普通话音调识别和音乐感知的试验研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

DLC-1信号通路系统介导TRAIL诱导人非小细胞肺癌细胞凋亡的研究

国家自然科学基金

0+阅读 · 2011年12月31日

NoxR在稻瘟菌致病性与发育中的作用机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

EGFR2单抗Herceptin修饰紫杉醇纳米胶束联合Survivin基因沉默靶向治疗鼻咽癌的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

携Herceptin和紫杉醇纳米造影剂乳腺癌分子显影和新辅助化疗研究

国家自然科学基金

0+阅读 · 2009年12月31日

Textual Entailment Recognition with Semantic Features from Empirical Text Representation

Arxiv

0+阅读 · 2022年10月18日

Generative Multi-hop Retrieval

Arxiv

0+阅读 · 2022年10月16日

Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation

Arxiv

0+阅读 · 2022年10月15日

LEATHER: A Framework for Learning to Generate Human-like Text in Dialogue

Arxiv

0+阅读 · 2022年10月14日

ConEntail: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining

Arxiv

0+阅读 · 2022年10月14日

Cross-Modal Object Tracking: Modality-Aware Representations and A Unified Benchmark

Arxiv

14+阅读 · 2021年11月11日

Sequence Level Contrastive Learning for Text Summarization

Sequence Level Contrastive Learning for Text Summarization

Arxiv

14+阅读 · 2021年9月24日

A survey on deep hashing for image retrieval

A survey on deep hashing for image retrieval

Arxiv

15+阅读 · 2020年6月10日

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Arxiv

23+阅读 · 2019年11月5日

UNITER: Learning UNiversal Image-TExt Representations

UNITER: Learning UNiversal Image-TExt Representations

Arxiv

23+阅读 · 2019年9月25日

VIP会员

文章信息

相关主题

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

赋能真实世界：基于大语言模型的产业智能体技术、实践与评测综述

军事行动中人工智能系统目标交战的附带损伤评估模型 | 最新文献

【普林斯顿博士论文】面向人本机器人学的安全与学习博弈论融合

美陆军协会（AUSA）2025 年会公布的美国十大武器与防务产品创新

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

【论文推荐】最新5篇行人再识别（ReID）相关论文—迁移学习、特征集成、重排序、多通道金字塔、深层生成模型

专知

12+阅读 · 2018年3月24日

相关论文

Textual Entailment Recognition with Semantic Features from Empirical Text Representation

Arxiv

0+阅读 · 2022年10月18日

Generative Multi-hop Retrieval

Arxiv

0+阅读 · 2022年10月16日

Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation

Arxiv

0+阅读 · 2022年10月15日

LEATHER: A Framework for Learning to Generate Human-like Text in Dialogue

Arxiv

0+阅读 · 2022年10月14日

ConEntail: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining

Arxiv

0+阅读 · 2022年10月14日

Cross-Modal Object Tracking: Modality-Aware Representations and A Unified Benchmark

Arxiv

14+阅读 · 2021年11月11日

Sequence Level Contrastive Learning for Text Summarization

Sequence Level Contrastive Learning for Text Summarization

Arxiv

14+阅读 · 2021年9月24日

A survey on deep hashing for image retrieval

A survey on deep hashing for image retrieval

Arxiv

15+阅读 · 2020年6月10日

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Arxiv

23+阅读 · 2019年11月5日

UNITER: Learning UNiversal Image-TExt Representations

UNITER: Learning UNiversal Image-TExt Representations

Arxiv

23+阅读 · 2019年9月25日

相关基金

血清exosomes中蛋白、miRNAs、lncRNAs多指标联合用于血吸虫活虫感染诊断的可行性研究

国家自然科学基金

0+阅读 · 2016年12月31日

RanGTPase核质运输系统通过调控AIF核转移介导细胞凋亡的分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

去泛素化酶15调控Mcl1稳定性在黑色素瘤对维罗非尼耐药中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Cr3+:ABSi2O6(A=Na，K，Ca；B=Mg，Al)可调谐激光晶体的研制

国家自然科学基金

0+阅读 · 2014年12月31日

人工耳蜗植入者汉语普通话音调识别和音乐感知的试验研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

DLC-1信号通路系统介导TRAIL诱导人非小细胞肺癌细胞凋亡的研究

国家自然科学基金

0+阅读 · 2011年12月31日

NoxR在稻瘟菌致病性与发育中的作用机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

EGFR2单抗Herceptin修饰紫杉醇纳米胶束联合Survivin基因沉默靶向治疗鼻咽癌的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

携Herceptin和紫杉醇纳米造影剂乳腺癌分子显影和新辅助化疗研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员