AVGZSLSNet:通过从多模式嵌入式中重建标签符文特征,进行视听通用零热学习 (AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings) - 专知论文

会员服务 ·

0

类标记 · 学成 · 类别 · 标注 · 模态 ·

2020 年 11 月 23 日

AVGZSLNet: Audio-Visual Generalized Zero-Shot Learning by Reconstructing Label Features from Multi-Modal Embeddings

翻译：AVGZSLSNet:通过从多模式嵌入式中重建标签符文特征,进行视听通用零热学习

Pratik Mazumder,Pravendra Singh,Kranti Kumar Parida,Vinay P. Namboodiri

from arxiv, Accepted in WACV 2021

In this paper, we propose a novel approach for generalized zero-shot learning in a multi-modal setting, where we have novel classes of audio/video during testing that are not seen during training. We use the semantic relatedness of text embeddings as a means for zero-shot learning by aligning audio and video embeddings with the corresponding class label text feature space. Our approach uses a cross-modal decoder and a composite triplet loss. The cross-modal decoder enforces a constraint that the class label text features can be reconstructed from the audio and video embeddings of data points. This helps the audio and video embeddings to move closer to the class label text embedding. The composite triplet loss makes use of the audio, video, and text embeddings. It helps bring the embeddings from the same class closer and push away the embeddings from different classes in a multi-modal setting. This helps the network to perform better on the multi-modal zero-shot learning task. Importantly, our multi-modal zero-shot learning approach works even if a modality is missing at test time. We test our approach on the generalized zero-shot classification and retrieval tasks and show that our approach outperforms other models in the presence of a single modality as well as in the presence of multiple modalities. We validate our approach by comparing it with previous approaches and using various ablations.

翻译：在本文中,我们提出了一种在多模式环境下普遍零点学习的新办法,在多模式环境下,我们拥有在测试期间没有看到的培训中新型的音频/视频类别。我们使用文本嵌入的语义关联性作为零点学习的手段,将音频和视频嵌入与相应的类标签文本属性空间相匹配。我们的方法使用跨模式解码器和复合三重损失。交叉模式解码器强制要求从数据点的音频和视频嵌入中重建类标签文本功能。这帮助音频和视频嵌入更接近类标签嵌入文本。复合三重损失利用了音频、视频和文本嵌入的语义。我们的方法将同一类的嵌入与相应的类的音频和视频嵌入相匹配。我们的方法使用跨模式,将不同类别嵌入的嵌入推出多模式。这有助于网络更好地完成多模式零点学习任务。我们多模式的零点学习方法可以发挥作用,即使一种模式在测试时已经丢失了。我们用一种模式来测试我们以前的版本的版本,也用另一种模式展示了我们以前的零位模式。

0

相关内容

类标记

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

专知会员服务

66+阅读 · 2020年4月17日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

38+阅读 · 2020年4月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【AAAI2020】知识图谱的生成式对抗零样本关系学习，Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs

【AAAI2020】知识图谱的生成式对抗零样本关系学习，Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs

专知会员服务

64+阅读 · 2020年1月11日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

【ICCV 2019 Workshop】Adaptive Confidence Smoothing for Generalized Zero-Shot Learning，巴伊兰大学 Yuval Atzmon

【ICCV 2019 Workshop】Adaptive Confidence Smoothing for Generalized Zero-Shot Learning，巴伊兰大学 Yuval Atzmon

专知会员服务

13+阅读 · 2019年10月31日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

revelation of MONet

revelation of MONet

CreateAMind

5+阅读 · 2019年6月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

小样本学习（Few-shot Learning）综述

小样本学习（Few-shot Learning）综述

云栖社区

22+阅读 · 2019年4月6日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

专知

12+阅读 · 2018年6月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

Arxiv

6+阅读 · 2020年10月26日

A Meta-Learning Framework for Generalized Zero-Shot Learning

A Meta-Learning Framework for Generalized Zero-Shot Learning

Arxiv

3+阅读 · 2019年9月10日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision

Arxiv

3+阅读 · 2018年11月15日

Generative Dual Adversarial Network for Generalized Zero-shot Learning

Arxiv

7+阅读 · 2018年11月12日

Multi-Label Zero-Shot Learning with Structured Knowledge Graphs

Arxiv

7+阅读 · 2018年5月26日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

Reconstruction Network for Video Captioning

Arxiv

5+阅读 · 2018年3月30日

A Generative Model For Zero Shot Learning Using Conditional Variational Autoencoders

Arxiv

9+阅读 · 2018年1月27日

A Unified approach for Conventional Zero-shot, Generalized Zero-shot and Few-shot Learning

Arxiv

4+阅读 · 2017年10月26日

VIP会员

文章信息

相关主题

相关VIP内容

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

专知会员服务

66+阅读 · 2020年4月17日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

38+阅读 · 2020年4月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【AAAI2020】知识图谱的生成式对抗零样本关系学习，Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs

【AAAI2020】知识图谱的生成式对抗零样本关系学习，Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs

专知会员服务

64+阅读 · 2020年1月11日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

【CCL 2019】ATT-第19期：文本生成 |Text Generation: From the Perspective of Interactive Inference （张家俊）

专知会员服务

43+阅读 · 2019年11月12日

【ICCV 2019 Workshop】Adaptive Confidence Smoothing for Generalized Zero-Shot Learning，巴伊兰大学 Yuval Atzmon

【ICCV 2019 Workshop】Adaptive Confidence Smoothing for Generalized Zero-Shot Learning，巴伊兰大学 Yuval Atzmon

专知会员服务

13+阅读 · 2019年10月31日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能安全治理白皮书（2025）

AgentOps综述：分类、挑战与未来方向

《商用大语言模型的升级风险管理：国家安全运用》

【伯克利博士论文】通过真实世界实践赋能机器人自主性

相关资讯

revelation of MONet

revelation of MONet

CreateAMind

5+阅读 · 2019年6月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

小样本学习（Few-shot Learning）综述

小样本学习（Few-shot Learning）综述

云栖社区

22+阅读 · 2019年4月6日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

专知

12+阅读 · 2018年6月9日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

Arxiv

6+阅读 · 2020年10月26日

A Meta-Learning Framework for Generalized Zero-Shot Learning

A Meta-Learning Framework for Generalized Zero-Shot Learning

Arxiv

3+阅读 · 2019年9月10日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision

Arxiv

3+阅读 · 2018年11月15日

Generative Dual Adversarial Network for Generalized Zero-shot Learning

Arxiv

7+阅读 · 2018年11月12日

Multi-Label Zero-Shot Learning with Structured Knowledge Graphs

Arxiv

7+阅读 · 2018年5月26日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

Reconstruction Network for Video Captioning

Arxiv

5+阅读 · 2018年3月30日

A Generative Model For Zero Shot Learning Using Conditional Variational Autoencoders

Arxiv

9+阅读 · 2018年1月27日

A Unified approach for Conventional Zero-shot, Generalized Zero-shot and Few-shot Learning

Arxiv

4+阅读 · 2017年10月26日

微信扫码咨询专知VIP会员