Category-Aware Transformer Network for Better Human-Object Interaction Detection - Zhuanzhi Papers

May 9, 2022

Category-Aware Transformer Network for Better Human-Object Interaction Detection

Leizhen Dong, Zhimin Li, Kunlun Xu, Zhijun Zhang, Luxin Yan, Sheng Zhong, Xu Zou

From arXiv; accepted by CVPR 2022

Human-Object Interaction (HOI) detection, which aims to localize a human and a relevant object while recognizing their interaction, is crucial for understanding a still image. Recently, transformer-based models have significantly advanced HOI detection. However, the capability of these models has not been fully explored, since the Object Query of the model is always simply initialized as zeros, which affects performance. In this paper, we study how to improve transformer-based HOI detectors by initializing the Object Query with category-aware semantic information. To this end, we propose the Category-Aware Transformer Network (CATN). Specifically, the Object Query is initialized via category priors represented by an external object detection model to yield better performance. Moreover, such category priors can be further used to enhance the representation ability of features via the attention mechanism. We first verify our idea via an Oracle experiment that initializes the Object Query with ground-truth category information. Extensive experiments then show that an HOI detection model equipped with our idea outperforms the baseline by a large margin and achieves a new state-of-the-art result.
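
The contrast between zero-initialized and category-aware Object Queries can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the category priors are encoded, so the random embedding table, the slot-filling rule, and every name below (`zero_queries`, `make_category_embeddings`, `category_aware_queries`) are assumptions made purely for illustration.

```python
import random


def zero_queries(num_queries, dim):
    """Baseline described in the abstract: every Object Query starts as zeros."""
    return [[0.0] * dim for _ in range(num_queries)]


def make_category_embeddings(num_categories, dim, seed=0):
    """Hypothetical stand-in for category priors: one fixed vector per category."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_categories)]


def category_aware_queries(detected_categories, embeddings, num_queries, dim):
    """Fill query slots with the embeddings of detected categories.

    Slots beyond the number of detections stay at zero (an assumed padding rule).
    """
    queries = zero_queries(num_queries, dim)
    for slot, category_id in enumerate(detected_categories[:num_queries]):
        queries[slot] = list(embeddings[category_id])
    return queries


# Assumed setup: 80 object categories, 8-dimensional queries, 5 query slots.
embeddings = make_category_embeddings(num_categories=80, dim=8)

# Assumed output of an external object detector: one category id per detection.
detected = [0, 17, 17]

queries = category_aware_queries(detected, embeddings, num_queries=5, dim=8)
```

In this sketch the first three query slots carry category information before decoding starts, while the remaining two fall back to the zero baseline. A real detector would more likely use learned embeddings and feed the queries into a transformer decoder.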
