RefCrowd: Grounding the Target in Crowd with Referring Expressions - Zhuanzhi Papers

June 16, 2022

RefCrowd: Grounding the Target in Crowd with Referring Expressions

Heqian Qiu, Hongliang Li, Taijin Zhao, Lanxiao Wang, Qingbo Wu, Fanman Meng

Crowd understanding has attracted widespread interest in the vision domain because of its practical significance. Unfortunately, no effort has yet been made to explore crowd understanding in the multi-modal domain that bridges natural language and computer vision. Referring expression comprehension (REF) is a representative multi-modal task. Current REF studies focus mostly on grounding the target object among multiple distinctive categories in general scenarios, so they are difficult to apply to complex real-world crowd understanding. To fill this gap, we propose a new challenging dataset, called RefCrowd, which aims at finding the target person in a crowd using referring expressions. It requires not only sufficiently mining the natural language information, but also carefully focusing on subtle differences between the target and a crowd of persons with similar appearance, so as to realize a fine-grained mapping from language to vision. Furthermore, we propose a Fine-grained Multi-modal Attribute Contrastive Network (FMAC) to deal with REF in crowd understanding. It first decomposes the intricate visual and language features into attribute-aware multi-modal features, and then captures discriminative but robust fine-grained attribute features to effectively distinguish these subtle differences between similar persons. The proposed method outperforms existing state-of-the-art (SoTA) methods on our RefCrowd dataset and existing REF datasets. In addition, we implement an end-to-end REF toolbox to support deeper research in the multi-modal domain. Our dataset and code are available at: https://qiuheqian.github.io/datasets/refcrowd/
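To make the task formulation concrete, the following is a minimal toy sketch of attribute-wise matching between a referring expression and candidate persons. It is not the FMAC architecture and does not reproduce anything from the paper's code: the function names (`cosine`, `ground_expression`), the attribute names ("clothing", "action"), and the hand-made two-dimensional feature vectors are all hypothetical, chosen only to illustrate why per-attribute comparison can separate persons who look alike overall.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ground_expression(query_attrs, candidates):
    """Score each candidate by its mean per-attribute similarity to the query.

    query_attrs: dict mapping a (hypothetical) attribute name to a language feature vector.
    candidates:  list of dicts mapping attribute names to visual feature vectors.
    Returns (index of best candidate, list of scores).
    """
    scores = []
    for cand in candidates:
        shared = [name for name in query_attrs if name in cand]
        if not shared:
            scores.append(0.0)  # nothing to compare on
            continue
        total = sum(cosine(query_attrs[name], cand[name]) for name in shared)
        scores.append(total / len(shared))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical toy data: three persons who each match the query on at least one attribute.
query = {"clothing": [1.0, 0.0], "action": [0.0, 1.0]}
people = [
    {"clothing": [1.0, 0.0], "action": [1.0, 0.0]},  # right clothing, wrong action
    {"clothing": [1.0, 0.1], "action": [0.0, 1.0]},  # close match on both attributes
    {"clothing": [0.0, 1.0], "action": [0.0, 1.0]},  # wrong clothing, right action
]
best_index, scores = ground_expression(query, people)
print(best_index, [round(s, 3) for s in scores])
```

In this toy setup the second person is selected because it is the only one that matches on both attributes; the other two tie on a lower score. A real system would learn the attribute features from images and text rather than use fixed vectors.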
