Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from a large candidate set. Most existing LMTC approaches rely on massive human-annotated training data, which are often costly to obtain and suffer from a long-tailed label distribution (i.e., many labels occur only a few times in the training set). In this paper, we study LMTC under the zero-shot setting, which does not require any labeled documents and relies only on label surface names and descriptions. To train a classifier that calculates the similarity score between a document and a label, we propose a novel metadata-induced contrastive learning (MICoL) method. Different from previous text-based contrastive learning techniques, MICoL exploits document metadata (e.g., authors, venues, and references of research papers), which are widely available on the Web, to derive similar document-document pairs. Experimental results on two large-scale datasets show that: (1) MICoL significantly outperforms strong zero-shot text classification and contrastive learning baselines; (2) MICoL is on par with the state-of-the-art supervised metadata-aware LMTC method trained on 10K-200K labeled documents; and (3) MICoL tends to predict more infrequent labels than supervised methods, thus alleviating the deteriorated performance on long-tailed labels.
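To make the two core ideas above concrete, the following is a minimal illustrative sketch (not the authors' implementation): deriving "similar" document-document pairs from shared metadata such as venue or references, and training a document encoder with an InfoNCE-style contrastive loss on those pairs. The metadata field names ("venue", "references"), the pairing rule, and the use of in-batch negatives are hypothetical placeholders chosen for illustration.

```python
# Sketch only: metadata-induced positive pairs + InfoNCE contrastive loss.
# Field names and the pairing heuristic are assumptions, not MICoL's exact setup.
from itertools import combinations
import torch
import torch.nn.functional as F

def metadata_positive_pairs(docs):
    """Return index pairs of documents that share a venue or cite a common paper."""
    pairs = []
    for i, j in combinations(range(len(docs)), 2):
        same_venue = docs[i]["venue"] == docs[j]["venue"]
        shared_ref = bool(set(docs[i]["references"]) & set(docs[j]["references"]))
        if same_venue or shared_ref:
            pairs.append((i, j))
    return pairs

def info_nce_loss(anchor_emb, positive_emb, temperature=0.05):
    """InfoNCE loss: row i of `positive_emb` is the positive for row i of
    `anchor_emb`; all other rows in the batch act as in-batch negatives."""
    anchor = F.normalize(anchor_emb, dim=-1)
    positive = F.normalize(positive_emb, dim=-1)
    logits = anchor @ positive.T / temperature   # (B, B) cosine-similarity logits
    labels = torch.arange(logits.size(0))        # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

# Toy usage: random vectors stand in for document-encoder outputs.
docs = [
    {"venue": "WWW", "references": ["p1", "p2"]},
    {"venue": "WWW", "references": ["p3"]},
    {"venue": "KDD", "references": ["p2", "p4"]},
]
print(metadata_positive_pairs(docs))             # e.g. [(0, 1), (0, 2)]

anchor = torch.randn(8, 128)
positive = anchor + 0.1 * torch.randn(8, 128)    # perturbed copies as stand-in positives
print(info_nce_loss(anchor, positive).item())
```

At inference time, the same trained encoder would score a document against each label's surface name or description, and the highest-scoring labels are predicted; no labeled documents are needed for training.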