Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification - 专知 (Zhuanzhi) Papers

Robustness · Equivalence · Re-ID · Modality · Embedding

April 19, 2023

Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

Suncheng Xiang, Jingsheng Gao, Mengyuan Guan, Jiacheng Ruan, Chengfeng Zhou, Ting Liu, Dahong Qian, Yuzhuo Fu

Generalizable person re-identification (Re-ID) is an active research topic in machine learning and computer vision, and it plays a significant role in realistic scenarios because of its applications in public security and video surveillance. However, previous methods mainly focus on visual representation learning and neglect the potential of semantic features during training, which easily leads to poor generalization when a model is adapted to a new domain. In this paper, we propose a Multi-Modal Equivalent Transformer, called MMET, for more robust visual-semantic embedding learning on visual, textual, and visual-textual tasks. To further enhance robust feature learning in the transformer setting, we introduce a dynamic masking mechanism, the Masked Multimodal Modeling (MMM) strategy, which masks both image patches and text tokens. It can work jointly on multimodal or unimodal data and significantly boosts the performance of generalizable person Re-ID. Extensive experiments on benchmark datasets demonstrate the competitive performance of our method over previous approaches. We hope this method can advance research on visual-semantic representation learning. Our source code is publicly available at https://github.com/JeremyXSC/MMET.
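
The abstract gives no implementation details for the masking strategy, so the following is only a minimal sketch of the general idea of masking image patches and text tokens, applied to whichever modality is present. It is not the authors' code (see the repository above for that). The names `mask_sequence` and `mask_multimodal`, the `[MASK]` placeholder, and the fixed ratios 0.5 and 0.15 are all assumptions made for illustration; the paper describes its masking as dynamic, which this fixed-ratio sketch does not attempt to reproduce.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a masked patch or token


def mask_sequence(items, ratio, rng, mask_value=MASK):
    """Replace a random `ratio` fraction of `items` with `mask_value`.

    Returns the masked copy and the sorted list of masked positions.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    n_mask = int(round(len(items) * ratio))
    positions = sorted(rng.sample(range(len(items)), n_mask))
    masked = list(items)
    for i in positions:
        masked[i] = mask_value
    return masked, positions


def mask_multimodal(patches=None, tokens=None,
                    patch_ratio=0.5, token_ratio=0.15, seed=0):
    """Mask whichever modalities are given: patches, tokens, or both.

    The ratios are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    out = {}
    if patches is not None:
        out["patches"] = mask_sequence(patches, patch_ratio, rng)
    if tokens is not None:
        out["tokens"] = mask_sequence(tokens, token_ratio, rng)
    return out


if __name__ == "__main__":
    patches = [f"patch_{i}" for i in range(8)]
    tokens = "a man wearing a red jacket and blue jeans".split()
    result = mask_multimodal(patches, tokens)
    print(result["patches"][0])  # image patches with some entries masked
    print(result["tokens"][0])   # text tokens with some entries masked
```

Passing only `tokens` or only `patches` exercises the unimodal case, while passing both corresponds to the multimodal case mentioned in the abstract.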
