While the performance of offline neural speech separation systems has been greatly advanced by the recent development of novel neural network architectures, there is typically an inevitable performance gap between these systems and their online variants. In this paper, we investigate how RNN-based offline neural speech separation systems can be converted into their online counterparts while mitigating the performance degradation. We decompose or reorganize the forward and backward RNN layers in a bidirectional RNN layer to form an online path and an offline path, which enables the model to perform both online and offline processing with the same set of model parameters. We further introduce two training strategies for improving the online model via either a pretrained offline model or a multitask training objective. Experimental results show that, compared to online models trained from scratch, the proposed layer decomposition and reorganization schemes and training strategies can effectively mitigate the performance gap between two RNN-based offline separation models and their online variants.
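The core idea of sharing one set of parameters between an online (causal) path and an offline (bidirectional) path can be sketched as follows. This is a minimal numpy illustration with hypothetical Elman-style cells standing in for the paper's RNN layers, not the actual separation architecture: the offline path concatenates forward and backward hidden states, while the online path reuses only the forward cell's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, T = 4, 8, 6  # input size, hidden size, sequence length (toy values)

# Shared parameters: one forward cell and one backward cell.
W_f = rng.standard_normal((d_h, d_in)) * 0.1
U_f = rng.standard_normal((d_h, d_h)) * 0.1
W_b = rng.standard_normal((d_h, d_in)) * 0.1
U_b = rng.standard_normal((d_h, d_h)) * 0.1

def run_rnn(x, W, U, reverse=False):
    """Run a unidirectional Elman RNN over the sequence x of shape (T, d_in)."""
    steps = range(len(x) - 1, -1, -1) if reverse else range(len(x))
    h, out = np.zeros(d_h), [None] * len(x)
    for t in steps:
        h = np.tanh(W @ x[t] + U @ h)
        out[t] = h
    return np.stack(out)

x = rng.standard_normal((T, d_in))

# Offline path: full bidirectional output (causal + anticausal states).
offline = np.concatenate(
    [run_rnn(x, W_f, U_f), run_rnn(x, W_b, U_b, reverse=True)], axis=-1
)

# Online path: causal forward states only, reusing the same W_f and U_f.
online = run_rnn(x, W_f, U_f)

# The online path's states equal the forward half of the offline path,
# so a single parameter set serves both processing modes.
assert np.allclose(online, offline[:, :d_h])
```

In this sketch the online path is exactly the causal half of the bidirectional computation; the paper's decomposition and reorganization schemes, plus the pretraining and multitask strategies, address how to train such shared parameters so the online path degrades as little as possible.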