公开词汇式物体探测,连同建议采矿和预测均匀化 (Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization) - 专知论文

会员服务 ·

0

MINE · 词表 · 知识 (knowledge) · Extensibility · 目标检测 ·

2022 年 6 月 24 日

Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization

翻译：公开词汇式物体探测,连同建议采矿和预测均匀化

Peixian Chen,Kekai Sheng,Mengdan Zhang,Yunhang Shen,Ke Li,Chunhua Shen

Open-vocabulary object detection (OVD) aims to scale up vocabulary size to detect objects of novel categories beyond the training vocabulary. Recent work resorts to the rich knowledge in pre-trained vision-language models. However, existing methods are ineffective in proposal-level vision-language alignment. Meanwhile, the models usually suffer from confidence bias toward base categories and perform worse on novel ones. To overcome the challenges, we present MEDet, a novel and effective OVD framework with proposal mining and prediction equalization. First, we design an online proposal mining to refine the inherited vision-semantic knowledge from coarse to fine, allowing for proposal-level detection-oriented feature alignment. Second, based on causal inference theory, we introduce a class-wise backdoor adjustment to reinforce the predictions on novel categories to improve the overall OVD performance. Extensive experiments on COCO and LVIS benchmarks verify the superiority of MEDet over the competing approaches in detecting objects of novel categories, e.g., 32.6% AP50 on COCO and 22.4% mask mAP on LVIS.

翻译：开放词汇对象探测(OVD)旨在扩大词汇规模,以探测培训词汇以外的新类别物体。最近的工作在经过培训的视觉语言模型中依靠丰富的知识。但是,现有的方法在建议层面的视觉语言调整方面是无效的。与此同时,模型通常对基类产生信心偏向,对新颖类别则表现更差。为了克服挑战,我们提出MEDet,这是一个创新和有效的OVD框架,提出了采矿和预测均衡的建议。首先,我们设计了一个在线采矿提案,以完善从粗皮到精细的传承的视觉-语学知识,允许建议层面的探测导向特征调整。第二,根据因果推断理论,我们引入了从阶级角度出发的后门调整,以加强对新类别作出的预测,以提高OVD的总体绩效。关于CO和LVIS基准的广泛实验核实MET优于新类别物体探测的竞争性方法,例如,CO50的AP50占32.6%,LVIS的MAP22.4%。

0

相关内容

MINE

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

功能金属有机框架的吸附位优化及其CH4-N2分离机理

国家自然科学基金

0+阅读 · 2014年12月31日

规整化多级孔沸石基催化剂对烃类大分子吸附、扩散和反应性研究

国家自然科学基金

0+阅读 · 2014年12月31日

中枢orexin能和组胺能神经系统在运动控制、运动学习和运动疾病中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

松香基两亲分子在功能化溶剂中的自组装

国家自然科学基金

0+阅读 · 2013年12月31日

非线性薛定谔型方程的怪波解

国家自然科学基金

0+阅读 · 2012年12月31日

柴油机活塞振荡冷却过程射流流动与过冷沸腾传热机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效NH3-SCR催化剂的设计与合成

国家自然科学基金

0+阅读 · 2011年12月31日

MST1与GAPDH相互作用及在心肌细胞凋亡中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

基于图域几何PDE与特征不变量的离散曲面处理

国家自然科学基金

0+阅读 · 2009年12月31日

蛋白质硝基化修饰在机械创伤致心肌细胞凋亡中的作用

国家自然科学基金

0+阅读 · 2008年12月31日

End-to-end Temporal Action Detection with Transformer

Arxiv

0+阅读 · 2022年8月11日

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding

Arxiv

0+阅读 · 2022年8月11日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation

Arxiv

11+阅读 · 2021年12月9日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Arxiv

17+阅读 · 2020年3月31日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人水面艇的实战应用》最新42页报告

《评估人工智能在判定自卫行动之必要性与相称性中的作用》报告

人工智能代理提升战时舰船战备水平

《利用虚拟现实与增强现实技术加强海港海岸线监测》报告

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

End-to-end Temporal Action Detection with Transformer

Arxiv

0+阅读 · 2022年8月11日

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding

Arxiv

0+阅读 · 2022年8月11日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation

Arxiv

11+阅读 · 2021年12月9日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Arxiv

17+阅读 · 2020年3月31日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

相关基金

功能金属有机框架的吸附位优化及其CH4-N2分离机理

国家自然科学基金

0+阅读 · 2014年12月31日

规整化多级孔沸石基催化剂对烃类大分子吸附、扩散和反应性研究

国家自然科学基金

0+阅读 · 2014年12月31日

中枢orexin能和组胺能神经系统在运动控制、运动学习和运动疾病中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

松香基两亲分子在功能化溶剂中的自组装

国家自然科学基金

0+阅读 · 2013年12月31日

非线性薛定谔型方程的怪波解

国家自然科学基金

0+阅读 · 2012年12月31日

柴油机活塞振荡冷却过程射流流动与过冷沸腾传热机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

高效NH3-SCR催化剂的设计与合成

国家自然科学基金

0+阅读 · 2011年12月31日

MST1与GAPDH相互作用及在心肌细胞凋亡中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

基于图域几何PDE与特征不变量的离散曲面处理

国家自然科学基金

0+阅读 · 2009年12月31日

蛋白质硝基化修饰在机械创伤致心肌细胞凋亡中的作用

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员