Embedding Arithmetic of Multimodal Queries for Image Retrieval - 专知 (Zhuanzhi) Papers


Multimodal · Image retrieval · Transformation · AIM · Dataset

October 20, 2022

Embedding Arithmetic of Multimodal Queries for Image Retrieval


Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk

From arXiv; accepted at O-DRUM (CVPR workshop 2022)

Latent text representations exhibit geometric regularities, such as the famous analogy: queen is to king what woman is to man. Such structured semantic relations were not demonstrated on image representations. Recent works aiming at bridging this semantic gap embed images and text into a multimodal space, enabling the transfer of text-defined transformations to the image modality. We introduce the SIMAT dataset to evaluate the task of Image Retrieval with Multimodal queries. SIMAT contains 6k images and 18k textual transformation queries that aim at either replacing scene elements or changing pairwise relationships between scene elements. The goal is to retrieve an image consistent with the (source image, text transformation) query. We use an image/text matching oracle (OSCAR) to assess whether the image transformation is successful. The SIMAT dataset will be publicly available. We use SIMAT to evaluate the geometric properties of multimodal embedding spaces trained with an image/text matching objective, like CLIP. We show that vanilla CLIP embeddings are not very well suited to transform images with delta vectors, but that a simple finetuning on the COCO dataset can bring dramatic improvements. We also study whether it is beneficial to leverage pretrained universal sentence encoders (FastText, LASER and LaBSE).
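The idea of transforming an image with a text-defined delta vector can be illustrated with a minimal sketch. It assumes a shared embedding space in which image and text vectors are directly comparable, and it uses hand-made 3-dimensional toy vectors in place of real CLIP outputs. The function names, the `scale` parameter, the L2 normalization, and the cosine-similarity ranking are assumptions made for illustration, not details stated in the abstract.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit L2 norm along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def transform_query(image_emb, source_text_emb, target_text_emb, scale=1.0):
    """Shift an image embedding by a text-defined delta vector.

    delta = target_text - source_text; `scale` is a hypothetical knob
    controlling how strongly the delta is applied.
    """
    delta = target_text_emb - source_text_emb
    return normalize(image_emb + scale * delta)

def retrieve(query_emb, database_embs, k=1):
    """Return indices of the k database rows closest to the query (cosine)."""
    scores = normalize(database_embs) @ normalize(query_emb)
    return np.argsort(-scores)[:k]

# Toy space (made up): axis 0 ~ "cat", axis 1 ~ "dog", axis 2 ~ "on grass".
text_cat = np.array([1.0, 0.0, 0.0])
text_dog = np.array([0.0, 1.0, 0.0])
database = normalize(np.array([
    [1.0, 0.0, 1.0],  # index 0: cat on grass
    [0.0, 1.0, 1.0],  # index 1: dog on grass
    [1.0, 0.0, 0.0],  # index 2: cat, no grass
]))

# Query: (source image = cat on grass, transformation = "cat" -> "dog").
source_image = database[0]
query = transform_query(source_image, text_cat, text_dog)
top = retrieve(query, database, k=1)
print(top)
```

In this toy setup the transformed query lands closest to the "dog on grass" entry, which is the behavior the task asks for: the replaced element changes while the rest of the scene is preserved. Real embedding spaces need not behave this cleanly, which is what the SIMAT evaluation is designed to measure.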

