We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint? We propose a neural rendering approach, the Visually-Guided Acoustic Synthesis (ViGAS) network, which learns to synthesize the sound at an arbitrary point in space by analyzing the input audio-visual cues. To benchmark this task, we collect two first-of-their-kind large-scale multi-view audio-visual datasets, one synthetic and one real. We show that our model successfully reasons about the spatial cues and synthesizes faithful audio on both datasets. To our knowledge, this work represents the first formulation, dataset, and approach for solving the novel-view acoustic synthesis task, which has exciting potential applications ranging from AR/VR to art and design. Building on what this work unlocks, we believe the future of novel-view synthesis lies in multi-modal learning from videos.
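To make the task's input/output structure concrete, the following is a minimal sketch of an NVAS-style interface: source-view audio and visual features plus a relative target pose map to audio at the target viewpoint. The class name, feature dimensions, and simple concatenation-based fusion are illustrative assumptions, not the authors' ViGAS architecture.

```python
# Illustrative sketch of the NVAS task interface; names, shapes, and the
# fusion scheme are assumptions, not the paper's actual ViGAS model.
import torch
import torch.nn as nn


class NVASSketch(nn.Module):
    """Maps source-view audio-visual observations to target-view audio features."""

    def __init__(self, audio_dim: int = 512, visual_dim: int = 512, pose_dim: int = 7):
        super().__init__()
        # Encode the sound and image observed at the source viewpoint.
        self.audio_encoder = nn.Linear(audio_dim, 256)
        self.visual_encoder = nn.Linear(visual_dim, 256)
        # The relative pose of the unseen target viewpoint conditions the synthesis.
        self.pose_encoder = nn.Linear(pose_dim, 256)
        # Decode the fused audio-visual-pose features into target-view audio features.
        self.decoder = nn.Sequential(
            nn.Linear(256 * 3, 512), nn.ReLU(), nn.Linear(512, audio_dim)
        )

    def forward(self, source_audio, source_image_feat, target_pose):
        fused = torch.cat(
            [
                self.audio_encoder(source_audio),
                self.visual_encoder(source_image_feat),
                self.pose_encoder(target_pose),
            ],
            dim=-1,
        )
        return self.decoder(fused)  # predicted audio at the target viewpoint


# Usage with one batch of pre-extracted features (hypothetical shapes).
model = NVASSketch()
pred = model(torch.randn(4, 512), torch.randn(4, 512), torch.randn(4, 7))
print(pred.shape)  # torch.Size([4, 512])
```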