NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer

This work proposes a multichannel narrow-band speech separation network. In the short-time Fourier transform (STFT) domain, the proposed network processes each frequency independently, and all frequencies use a shared network. For each frequency, the network performs end-to-end speech separation, namely taking as input the STFT coefficients of microphone signals, and predicting the separated STFT coefficients of multiple speakers. The proposed network learns to cluster the frame-wise spatial/steering vectors that belong to different speakers. It is mainly composed of three components. First, a self-attention network. Clustering of spatial vectors shares a similar principle with the self-attention mechanism in the sense of computing the similarity of vectors and then aggregating similar vectors. Second, a convolutional feed-forward network. The convolutional layers are employed for signal smoothing and reverberation processing. Third, a novel hidden-layer normalization method, i.e. group batch normalization (GBN), is especially designed for the proposed narrow-band network to maintain the distribution of hidden units over frequencies. Overall, the proposed network is named NBC2, as it is a revised version of our previous NBC (narrow-band conformer) network. Experiments show that 1) the proposed network outperforms other state-of-the-art methods by a large margin, 2) the proposed GBN improves the signal-to-distortion ratio by 3 dB, relative to other normalization methods, such as batch/layer/group normalization, 3) the proposed narrow-band network is spectrum-agnostic, as it does not learn spectral patterns, and 4) the proposed network is indeed performing frame clustering (demonstrated by the attention maps).
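The narrow-band formulation described above (each STFT frequency processed independently by one shared network) can be illustrated with a minimal NumPy sketch. It only shows the data layout. The function names, the array sizes (8 microphones, 257 frequencies, 100 frames, 2 speakers), and the single linear map standing in for the shared network are all assumptions made for illustration; they are not the NBC2 architecture, which uses self-attention, convolutional feed-forward layers, and GBN.

```python
import numpy as np

def to_narrow_band(stft):
    """Rearrange a multichannel STFT into independent per-frequency sequences.

    stft: complex array of shape (M, F, T) = (microphones, frequencies, frames).
    Returns a real array of shape (F, T, 2*M): one sequence of T frames per
    frequency, whose features are the real and imaginary parts of the M channels.
    """
    x = np.transpose(stft, (1, 2, 0))                 # (F, T, M)
    return np.concatenate([x.real, x.imag], axis=-1)  # (F, T, 2M)

def shared_network(x, weights):
    """Hypothetical stand-in for the shared network.

    The same weights are applied to every frequency, so frequencies behave
    like independent items of a batch and never exchange information.
    """
    return x @ weights                                # (F, T, 2M) @ (2M, 2S) -> (F, T, 2S)

def from_narrow_band(y, n_speakers):
    """Convert (F, T, 2S) real outputs back to (S, F, T) complex STFT coefficients."""
    real, imag = y[..., :n_speakers], y[..., n_speakers:]
    return np.transpose(real + 1j * imag, (2, 0, 1))  # (S, F, T)

# Assumed sizes, chosen only for the example.
M, F, T, S = 8, 257, 100, 2
rng = np.random.default_rng(0)
stft = rng.standard_normal((M, F, T)) + 1j * rng.standard_normal((M, F, T))
weights = rng.standard_normal((2 * M, 2 * S))         # placeholder for learned parameters

narrow = to_narrow_band(stft)                         # (F, T, 2M)
separated = from_narrow_band(shared_network(narrow, weights), S)
print(narrow.shape, separated.shape)
```

Because every frequency passes through the same parameters and no operation mixes the frequency axis, processing one frequency alone gives the same result as processing it as part of the full batch. This is the sense in which such a network cannot learn patterns across the spectrum.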
