We present a neural network that renders binaural speech from monaural audio, given the position and orientation of the source. Most previous work has focused on synthesizing binaural speech by conditioning on source position and orientation in the feature space of convolutional neural networks. These synthesis approaches are powerful at estimating the target binaural speech even for in-the-wild data, but are difficult to generalize to rendering audio from out-of-distribution domains. To alleviate this, we propose Neural Fourier Shift (NFS), a novel network architecture that enables binaural speech rendering in the Fourier space. Specifically, building on the geometric time delay determined by the distance between the source and the receiver, NFS is trained to predict the delays and scales of various early reflections. By design, NFS is efficient in both memory and computation, is interpretable, and operates independently of the source domain. Experimental results show that NFS outperforms previous studies on the benchmark dataset, with up to 25 times less memory and 6 times fewer computations.
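As a rough illustration of the core idea (not the paper's implementation), the sketch below shows how a time delay becomes a multiplication by a phase ramp in Fourier space, and how scaled, delayed copies of a mono source can be summed to model early reflections for one ear. The function names and the `delays`/`scales` values are hypothetical stand-ins for the quantities NFS would predict.

```python
# Minimal sketch of a Fourier-space time shift, assuming NFS-style
# delay/scale predictions per early reflection (values here are made up).
import numpy as np

def fourier_shift(x, delay_samples):
    """Delay x by a (possibly fractional) number of samples via the
    Fourier shift theorem: X(f) -> X(f) * exp(-2j*pi*f*delay)."""
    n = len(x)
    freqs = np.fft.rfftfreq(n)        # normalized frequency, cycles/sample
    X = np.fft.rfft(x)
    return np.fft.irfft(X * np.exp(-2j * np.pi * freqs * delay_samples), n=n)

def render_ear(mono, delays, scales):
    """Sum scaled, delayed copies of the mono source, one per reflection."""
    return sum(a * fourier_shift(mono, d) for a, d in zip(scales, delays))

# Usage: render one ear from a mono signal with three early reflections.
rng = np.random.default_rng(0)
mono = rng.standard_normal(16000)     # 1 s of audio at 16 kHz
delays = [12.0, 37.5, 80.25]          # delays in samples (fractional allowed)
scales = [1.0, 0.4, 0.15]             # attenuation per reflection
left = render_ear(mono, delays, scales)
```

Because the shift is applied as a phase ramp rather than an integer sample offset, sub-sample delays come for free, which is one reason operating in the Fourier space is attractive for this task.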