Query2Label: A Simple Transformer Way to Multi-Label Classification - 专知 (Zhuanzhi) Papers


July 22, 2021

Query2Label: A Simple Transformer Way to Multi-Label Classification


Shilong Liu,Lei Zhang,Xiao Yang,Hang Su,Jun Zhu

This paper presents a simple and effective approach to solving the multi-label classification problem. The proposed approach leverages Transformer decoders to query the existence of a class label. The use of Transformer is rooted in the need of extracting local discriminative features adaptively for different labels, which is a strongly desired property due to the existence of multiple objects in one image. The built-in cross-attention module in the Transformer decoder offers an effective way to use label embeddings as queries to probe and pool class-related features from a feature map computed by a vision backbone for subsequent binary classifications. Compared with prior works, the new framework is simple, using standard Transformers and vision backbones, and effective, consistently outperforming all previous works on five multi-label classification data sets, including MS-COCO, PASCAL VOC, NUS-WIDE, and Visual Genome. Particularly, we establish $91.3\%$ mAP on MS-COCO. We hope its compact structure, simple implementation, and superior performance serve as a strong baseline for multi-label classification tasks and future studies. The code will be available soon at https://github.com/SlongLiu/query2labels.
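The core idea in the abstract, label embeddings acting as queries that pool class-related features from a backbone feature map before one binary decision per label, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python, a single attention head, and no learned query/key/value projections, layer norms, or feed-forward layers. All names (`cross_attention_pool`, `predict`, `head_weights`) and the random inputs are hypothetical and chosen for illustration only.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_pool(label_queries, features):
    """For each label query, attend over all spatial positions and
    return the attention-weighted average of the feature vectors."""
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    pooled = []
    for q in label_queries:
        weights = softmax([dot(q, f) * scale for f in features])
        pooled.append(
            [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
        )
    return pooled


def predict(label_queries, features, head_weights, head_biases):
    """One independent binary classifier (linear + sigmoid) per label."""
    pooled = cross_attention_pool(label_queries, features)
    logits = [dot(w, p) + b for w, p, b in zip(head_weights, pooled, head_biases)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Hypothetical sizes: 4 labels, a 7x7 feature map flattened to 49 positions, 8 channels.
rng = random.Random(0)
num_labels, num_positions, dim = 4, 49, 8
features = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_positions)]
label_queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_labels)]
head_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_labels)]
head_biases = [0.0] * num_labels

probs = predict(label_queries, features, head_weights, head_biases)
print(probs)  # one probability per label; each label is decided independently
```

Because every label has its own query, each label can focus on a different image region, which is the property the abstract describes as adaptive extraction of local discriminative features. In a trained model the queries, projections, and classifier weights would be learned, typically with a per-label binary loss.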


