Sports analytics benefits from recent advances in machine learning that provide a competitive advantage for teams or individuals. One important task in this context is measuring the performance of individual players to provide reports and logs for subsequent analysis. During sports events such as basketball matches, this involves re-identifying players either across multiple camera viewpoints or from a single camera viewpoint at different times. In this work, we investigate whether the outstanding zero-shot performance of pre-trained CLIP models can be transferred to the domain of player re-identification. For this purpose, we reformulate CLIP's contrastive language-to-image pre-training into a contrastive image-to-image training approach with the InfoNCE loss as the training objective. Unlike previous work, our approach is entirely class-agnostic and benefits from large-scale pre-training. With a fine-tuned CLIP ViT-L/14 model we achieve 98.44 % mAP on the MMSports 2022 Player Re-Identification challenge. Furthermore, we show that CLIP Vision Transformers already possess strong OCR capabilities, identifying useful player features such as shirt numbers in a zero-shot manner without any fine-tuning on the dataset. By applying the Score-CAM algorithm, we visualise the image regions that our fine-tuned model considers most important when computing the similarity score between two images of a player.
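The image-to-image InfoNCE objective mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the temperature value, and the use of NumPy instead of a deep-learning framework are all assumptions made for clarity. Row i of each embedding matrix is assumed to come from the same player instance (the positive pair); all other rows in the batch act as negatives.

```python
import numpy as np

def info_nce_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss for paired image embeddings (illustrative sketch).

    emb_a, emb_b: (N, D) arrays; row i of emb_a and row i of emb_b embed
    the same player instance, all other rows serve as in-batch negatives.
    """
    # L2-normalise so the dot product equals cosine similarity
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) scaled similarity matrix

    def cross_entropy_diagonal(l):
        # Cross-entropy where the matching (diagonal) entry is the target class
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_p))

    # Average the loss over both matching directions (a->b and b->a)
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(logits.T))
```

With correctly paired embeddings the loss approaches zero, while mismatched pairs yield a large loss, which is the signal that pulls embeddings of the same player together and pushes different players apart.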