Text-based person retrieval aims to find the query person based on a textual description. The key is to learn a mapping into a common latent space shared by the visual and textual modalities. To achieve this goal, existing works employ segmentation to obtain explicit cross-modal alignments or utilize attention to explore salient alignments. These methods have two shortcomings: 1) Labeling cross-modal alignments is time-consuming. 2) Attention methods can explore salient cross-modal alignments but may ignore some subtle and valuable pairs. To alleviate these issues, we introduce an Implicit Visual-Textual (IVT) framework for text-based person retrieval. Different from previous models, IVT utilizes a single network to learn representations for both modalities, which contributes to visual-textual interaction. To explore fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM). The MLA module explores finer matching at the sentence, phrase, and word levels, while the BMM module aims to mine \textbf{more} semantic alignments between the visual and textual modalities. Extensive experiments are carried out to evaluate the proposed IVT on public datasets, i.e., CUHK-PEDES, RSTPReID, and ICFG-PEDES. Even without explicit body-part alignment, our approach still achieves state-of-the-art performance. Code is available at: https://github.com/TencentYoutuResearch/PersonRetrieval-IVT.
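The bidirectional mask modeling (BMM) idea can be illustrated with a minimal sketch of the masking step: tokens are randomly masked in each modality, and the model would then be trained to reconstruct them from cues in the other modality, pushing it to mine alignments beyond the most salient pairs. The function and token names below are hypothetical illustrations, not the authors' implementation.

```python
import random

def bidirectional_mask(visual_tokens, text_tokens, mask_ratio=0.3, seed=0):
    """Hypothetical sketch of the BMM masking step.

    Masks a fraction of tokens independently in each modality; a model
    trained on this objective must recover masked visual tokens from the
    text and masked text tokens from the image, encouraging it to mine
    subtle cross-modal alignments rather than only salient ones.
    """
    rng = random.Random(seed)

    def mask(tokens):
        # Mask at least one token per sequence.
        n = max(1, int(len(tokens) * mask_ratio))
        idx = set(rng.sample(range(len(tokens)), n))
        masked = ["[MASK]" if i in idx else t for i, t in enumerate(tokens)]
        return masked, sorted(idx)

    masked_v, v_idx = mask(visual_tokens)   # masked image patch tokens
    masked_t, t_idx = mask(text_tokens)     # masked description words
    return masked_v, v_idx, masked_t, t_idx

# Example: 10 patch tokens and a short description.
patches = [f"patch{i}" for i in range(10)]
words = "a man in a red jacket".split()
mv, vi, mt, ti = bidirectional_mask(patches, words, mask_ratio=0.3)
```

In practice the masked positions would be filled by learnable mask embeddings rather than a `[MASK]` string, and the reconstruction loss would be computed only at the masked indices returned here.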