Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations

October 3, 2022

Paul Primus, Gerhard Widmer

From arXiv; accepted at the DCASE Workshop 2022

The absence of large labeled datasets remains a significant challenge in many application areas of deep learning. Researchers and practitioners typically resort to transfer learning and data augmentation to alleviate this issue. We study these strategies in the context of audio retrieval with natural language queries (Task 6b of the DCASE 2022 Challenge). Our proposed system uses pre-trained embedding models to project recordings and textual descriptions into a shared audio-caption space in which related examples from different modalities are close. We employ various data augmentation techniques on audio and text inputs and systematically tune their corresponding hyperparameters with sequential model-based optimization. Our results show that the augmentation strategies used reduce overfitting and improve retrieval performance.
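The retrieval step implied by a shared audio-caption space can be illustrated with a minimal sketch. It assumes that embeddings have already been produced by some pre-trained audio and text models and that closeness is measured by cosine similarity; the abstract does not state the similarity measure, and the function names and toy vectors below are hypothetical, not taken from the authors' system.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale vectors to unit length so that dot products equal cosine similarity.
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def rank_recordings(caption_emb, audio_embs):
    # Hypothetical helper: rank recordings by cosine similarity to one caption.
    # caption_emb has shape (d,), audio_embs has shape (n, d).
    query = l2_normalize(caption_emb)
    audio = l2_normalize(audio_embs)
    scores = audio @ query                      # one similarity per recording
    order = np.argsort(-scores, kind="stable")  # most similar first
    return order, scores

# Toy embeddings standing in for the outputs of pre-trained models (assumption).
audio_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
caption_emb = np.array([0.9, 0.1, 0.0])

order, scores = rank_recordings(caption_emb, audio_embs)
print(order.tolist())
```

In a real system the embeddings would come from the trained audio and text encoders, and the ranked list would be evaluated with retrieval metrics such as recall at k.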
