Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling


27 June 2022


Lonce Wyse, Purnima Kamath, Chitralekha Gupta

We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve system objectives that neither is capable of addressing alone. The objective of the system is to generate interactively controllable sound models given (a) a range of sounds the model should be able to synthesize, and (b) a specification of the parametric controls for navigating that space of sounds. The range of sounds is defined by a dataset provided by the designer, while the means of navigation is defined by a combination of data labels and the selection of a sub-manifold from the latent space learned by the GAN. Our proposed system takes advantage of the rich latent space of a GAN, which consists of sounds that fill out the spaces "between" real data-like sounds. This augmented data from the GAN is then used to train an RNN for its ability to respond immediately and continuously to parameter changes and to generate audio over unlimited periods of time. Furthermore, we develop a self-organizing map technique for "smoothing" the latent space of the GAN that results in perceptually smooth interpolation between audio timbres. We validate this process through user studies. The system contributes advances to the state of the art for generative sound model design that include system configuration and components for improving interpolation and the expansion of audio modeling capabilities beyond musical pitch and percussive instrument sounds into the more complex space of audio textures.
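The data flow described in the abstract (select a sub-manifold of the GAN latent space, sample it, and pair each sample with its control parameters to train the RNN) can be illustrated with a minimal sketch. The abstract does not say how the sub-manifold is parameterized, so the bilinear blend of four corner latents used here is an assumption, and `toy_generator` is a hypothetical stand-in for a trained GAN generator. All function names are illustrative and are not taken from the authors' code.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Pair = Tuple[Tuple[float, float], List[float]]


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> Vector:
    """Linear interpolation between two latent vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def submanifold_point(corners: Sequence[Sequence[float]], u: float, v: float) -> Vector:
    """Bilinear blend of four corner latents (assumed parameterization).

    (u, v) in [0, 1]^2 play the role of the interactive control parameters.
    """
    z00, z10, z01, z11 = corners
    return lerp(lerp(z00, z10, u), lerp(z01, z11, u), v)


def toy_generator(z: Sequence[float], n_samples: int = 8) -> List[float]:
    """Hypothetical stand-in for a trained GAN generator.

    Produces a short sine wave whose frequency and amplitude depend on the
    first two latent coordinates.
    """
    freq, amp = 1.0 + abs(z[0]), abs(z[1])
    return [amp * math.sin(2.0 * math.pi * freq * n / n_samples) for n in range(n_samples)]


def build_training_pairs(
    corners: Sequence[Sequence[float]],
    generator: Callable[[Sequence[float]], List[float]],
    steps: int,
) -> List[Pair]:
    """Sample a steps x steps grid (steps >= 2) on the sub-manifold.

    Each entry pairs a control setting (u, v) with the audio generated there;
    such pairs could serve as conditioning/target data for a sequence model.
    """
    pairs: List[Pair] = []
    for i in range(steps):
        for j in range(steps):
            u, v = i / (steps - 1), j / (steps - 1)
            pairs.append(((u, v), generator(submanifold_point(corners, u, v))))
    return pairs


# Four illustrative 3-dimensional corner latents and a 3 x 3 sampling grid.
corners = [[0.0, 0.0, 1.0], [2.0, 0.0, 1.0], [0.0, 1.0, 1.0], [2.0, 1.0, 1.0]]
pairs = build_training_pairs(corners, toy_generator, steps=3)
```

In this sketch the control parameters are attached to the generated audio rather than to the original dataset, which mirrors the idea of using the GAN output as augmented, parameter-labelled training data for the RNN. The self-organizing map smoothing step is not modelled here.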

