使用图像嵌入的零热音频分类 (Zero-Shot Audio Classification using Image Embeddings) - 专知论文

会员服务 ·

0

Performer · INFORMS · Learning · Projection · 相似度 ·

2022 年 6 月 10 日

Zero-Shot Audio Classification using Image Embeddings

翻译：使用图像嵌入的零热音频分类

Duygu Dogan,Huang Xie,Toni Heittola,Tuomas Virtanen

from arxiv, Accepted to the European Signal Processing Conference (EUSIPCO) 2022

Supervised learning methods can solve the given problem in the presence of a large set of labeled data. However, the acquisition of a dataset covering all the target classes typically requires manual labeling which is expensive and time-consuming. Zero-shot learning models are capable of classifying the unseen concepts by utilizing their semantic information. The present study introduces image embeddings as side information on zero-shot audio classification by using a nonlinear acoustic-semantic projection. We extract the semantic image representations from the Open Images dataset and evaluate the performance of the models on an audio subset of AudioSet using semantic information in different domains; image, audio, and textual. We demonstrate that the image embeddings can be used as semantic information to perform zero-shot audio classification. The experimental results show that the image and textual embeddings display similar performance both individually and together. We additionally calculate the semantic acoustic embeddings from the test samples to provide an upper limit to the performance. The results show that the classification performance is highly sensitive to the semantic relation between test and training classes and textual and image embeddings can reach up to the semantic acoustic embeddings when the seen and unseen classes are semantically similar.

翻译：受监督的学习方法可以在存在大量标签数据的情况下解决给定的问题。然而, 获取覆盖所有目标类的数据集通常需要人工标签, 成本昂贵且耗时。零点学习模型能够使用语义信息对未知概念进行分类。本研究采用非线性声学- 语义学投影, 将图像嵌入作为零点音频分类的侧边信息。我们从开放图像数据集中提取语义图像显示, 并评估AudioSet音频子组的性能, 使用不同域的语义信息; 图像、音频和文本信息。我们证明图像嵌入可用作语义信息, 以进行零点音频音频分类。实验结果显示, 图像和文字嵌入在单个和文本嵌入中都表现相似。我们从测试样品中进一步计算语义性声学嵌入, 以提供性能的上限。结果显示, 分类性能对于测试、培训类和文本嵌入阶段之间的语义关系非常敏感, 和感官和图像嵌入时, 可以到达Syal- 嵌入。

0

相关内容

Performer

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

脱硫海水中溶解态气态汞在排放海域的生成机制及影响因素研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

益气活血法调控动脉粥样硬化大鼠Rho激酶信号通路的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Au1-xM'x/3DOM MOy (M' = Pd, Pt; M = Cr, Mn, Co, Fe)的可控制备及催化CO和VOC氧化的性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

并轴II-VI/IV纳米线异质结构的电子学性质研究

国家自然科学基金

0+阅读 · 2011年12月31日

Intermedin-53在心肌肥厚中的作用和机制

国家自然科学基金

0+阅读 · 2011年12月31日

survivin拮抗细胞衰老的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

磁性介孔结构型超细粒子负载离子液体催化制备生物柴油研究

国家自然科学基金

0+阅读 · 2009年12月31日

Incremental Few-Shot Semantic Segmentation via Embedding Adaptive-Update and Hyper-class Representation

Arxiv

0+阅读 · 2022年7月26日

Generative Extraction of Audio Classifiers for Speaker Identification

Arxiv

0+阅读 · 2022年7月26日

Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification

Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification

Arxiv

0+阅读 · 2022年7月25日

Learning Unsupervised Hierarchies of Audio Concepts

Arxiv

0+阅读 · 2022年7月21日

ImGAGN:Imbalanced Network Embedding via Generative Adversarial Graph Networks

Arxiv

14+阅读 · 2021年6月5日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

How to Fine-Tune BERT for Text Classification?

How to Fine-Tune BERT for Text Classification?

Arxiv

13+阅读 · 2019年5月14日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《巡飞弹药（爆炸性无人机）威胁态势分析》最新24页报告

《军用后勤无人机：破解战场运输挑战的创新方案》

人工智能战争：以色列、伊朗与新型AI战争形态

《俄乌战争：现代战争未来的启示与经验》

相关资讯

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Incremental Few-Shot Semantic Segmentation via Embedding Adaptive-Update and Hyper-class Representation

Arxiv

0+阅读 · 2022年7月26日

Generative Extraction of Audio Classifiers for Speaker Identification

Arxiv

0+阅读 · 2022年7月26日

Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification

Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification

Arxiv

0+阅读 · 2022年7月25日

Learning Unsupervised Hierarchies of Audio Concepts

Arxiv

0+阅读 · 2022年7月21日

ImGAGN:Imbalanced Network Embedding via Generative Adversarial Graph Networks

Arxiv

14+阅读 · 2021年6月5日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

How to Fine-Tune BERT for Text Classification?

How to Fine-Tune BERT for Text Classification?

Arxiv

13+阅读 · 2019年5月14日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

相关基金

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

脱硫海水中溶解态气态汞在排放海域的生成机制及影响因素研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

益气活血法调控动脉粥样硬化大鼠Rho激酶信号通路的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Au1-xM'x/3DOM MOy (M' = Pd, Pt; M = Cr, Mn, Co, Fe)的可控制备及催化CO和VOC氧化的性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

并轴II-VI/IV纳米线异质结构的电子学性质研究

国家自然科学基金

0+阅读 · 2011年12月31日

Intermedin-53在心肌肥厚中的作用和机制

国家自然科学基金

0+阅读 · 2011年12月31日

survivin拮抗细胞衰老的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

磁性介孔结构型超细粒子负载离子液体催化制备生物柴油研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员