极端多标签文本分类的快速多分辨率变压器微调 (Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification) - 专知论文

会员服务 ·

0

Transformer模型 · 变换 · 文本分类 · MoDELS · 标记空间 ·

2021 年 10 月 1 日

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

翻译：极端多标签文本分类的快速多分辨率变压器微调

Jiong Zhang,Wei-cheng Chang,Hsiang-fu Yu,Inderjit S. Dhillon

Extreme multi-label text classification (XMC) seeks to find relevant labels from an extreme large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging and semantic search. Recently, transformer based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC methods. Despite leveraging pre-trained transformer models for text representation, the fine-tuning procedure of transformer models on large label space still has lengthy computational time even with powerful GPUs. In this paper, we propose a novel recursive approach, XR-Transformer to accelerate the procedure through recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function. Empirical results show that XR-Transformer takes significantly less training time compared to other transformer-based XMC models while yielding better state-of-the-art results. In particular, on the public Amazon-3M dataset with 3 million labels, XR-Transformer is not only 20x faster than X-Transformer but also improves the Precision@1 from 51% to 54%.

翻译：极端多标签文本分类 (XMC) 试图从一个极大的文本输入的极大标签收藏中找到相关的标签。许多真实世界应用程序可以被设计成 XMC 问题, 如建议系统、文档标记和语义搜索。最近, 以变压器为基础的 XMC 方法, 如X- Transfer 和 LightXML 等, 与其他 XMC 方法相比, 有了显著的改进。尽管在文本代表方面利用了预先训练的变压器模型模型, 大标签空间变压器模型的微调程序仍然有很长的计算时间, 即使有强大的 GPPS 。在本文中, 我们提议一种新颖的递转方法, XR- Transformination, 以通过与原 XMC 目标函数相关的一系列多分辨率目标递增调变压器变压器模型加速程序。彩色结果表明, XR- Transmext 与其它以变压器为基础的 XMC 模型相比, 需要大量的培训时间, 同时产生更好的状态- 。。特别是, 在公共 Amazon-3M 3百万标签上, XR- Transforent 不仅比 X- 5% 1 还要快20x 更快, 更快于 X- 51%

0

相关内容

Transformer模型

Transformer模型

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

专知会员服务

51+阅读 · 2020年3月7日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

58+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

3分钟看懂史上最强NLP模型BERT

3分钟看懂史上最强NLP模型BERT

新智元

23+阅读 · 2019年2月27日

Github项目推荐 | awesome-bert：BERT相关资源大列表

Github项目推荐 | awesome-bert：BERT相关资源大列表

AI研习社

27+阅读 · 2019年2月26日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation

Arxiv

0+阅读 · 2021年11月23日

Residual Attention: A Simple but Effective Method for Multi-Label Recognition

Residual Attention: A Simple but Effective Method for Multi-Label Recognition

Arxiv

6+阅读 · 2021年8月5日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

X-BERT: eXtreme Multi-label Text Classification with BERT

X-BERT: eXtreme Multi-label Text Classification with BERT

Arxiv

12+阅读 · 2019年7月4日

How to Fine-Tune BERT for Text Classification?

How to Fine-Tune BERT for Text Classification?

Arxiv

13+阅读 · 2019年5月14日

End-to-End Text Classification via Image-based Embedding using Character-level Networks

End-to-End Text Classification via Image-based Embedding using Character-level Networks

Arxiv

5+阅读 · 2018年10月10日

Universal Language Model Fine-tuning for Text Classification

Arxiv

3+阅读 · 2018年5月23日

Fine-tuned Language Models for Text Classification

Arxiv

5+阅读 · 2018年1月18日

Subset Labeled LDA for Large-Scale Multi-Label Classification

Arxiv

3+阅读 · 2017年9月16日

VIP会员

文章信息

相关主题

Transformer模型

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

专知会员服务

51+阅读 · 2020年3月7日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

58+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

3分钟看懂史上最强NLP模型BERT

3分钟看懂史上最强NLP模型BERT

新智元

23+阅读 · 2019年2月27日

Github项目推荐 | awesome-bert：BERT相关资源大列表

Github项目推荐 | awesome-bert：BERT相关资源大列表

AI研习社

27+阅读 · 2019年2月26日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation

Arxiv

0+阅读 · 2021年11月23日

Residual Attention: A Simple but Effective Method for Multi-Label Recognition

Residual Attention: A Simple but Effective Method for Multi-Label Recognition

Arxiv

6+阅读 · 2021年8月5日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

X-BERT: eXtreme Multi-label Text Classification with BERT

X-BERT: eXtreme Multi-label Text Classification with BERT

Arxiv

12+阅读 · 2019年7月4日

How to Fine-Tune BERT for Text Classification?

How to Fine-Tune BERT for Text Classification?

Arxiv

13+阅读 · 2019年5月14日

End-to-End Text Classification via Image-based Embedding using Character-level Networks

End-to-End Text Classification via Image-based Embedding using Character-level Networks

Arxiv

5+阅读 · 2018年10月10日

Universal Language Model Fine-tuning for Text Classification

Arxiv

3+阅读 · 2018年5月23日

Fine-tuned Language Models for Text Classification

Arxiv

5+阅读 · 2018年1月18日

Subset Labeled LDA for Large-Scale Multi-Label Classification

Arxiv

3+阅读 · 2017年9月16日

微信扫码咨询专知VIP会员