This paper presents a novel discriminator-constrained optimal transport network (DOTN) that performs unsupervised domain adaptation for speech enhancement (SE), an essential regression task in speech processing. The DOTN estimates clean references for noisy speech in a target domain by exploiting the knowledge available from the source domain. Domain shift between training and testing data has been reported to be an obstacle for learning problems in diverse fields. Although a rich literature exists on unsupervised domain adaptation for classification, methods for regression remain scarce and often depend on additional information about the input data. The proposed DOTN approach fuses optimal transport (OT) theory from mathematical analysis with generative adversarial frameworks to help evaluate continuous labels in the target domain. Experimental results on two SE tasks demonstrate that, by extending the classical OT formulation, the proposed DOTN outperforms previous adversarial domain adaptation frameworks in a purely unsupervised manner.
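To make the "classical OT formulation" mentioned in the abstract concrete, the following is a minimal sketch of entropy-regularized optimal transport between source and target feature sets, followed by a barycentric mapping that carries continuous source labels over to the target samples. It is not the DOTN method: there is no discriminator or network here, and the function names (`sinkhorn_plan`, `transport_labels`), the squared Euclidean cost, the uniform sample weights, and the regularization value are all assumptions chosen for illustration.

```python
import numpy as np


def sinkhorn_plan(source, target, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between two point sets (uniform weights).

    Hypothetical helper for illustration; not taken from the paper.
    source: array of shape (n, d), target: array of shape (m, d).
    """
    n, m = len(source), len(target)
    a = np.full(n, 1.0 / n)  # uniform mass on source samples (assumption)
    b = np.full(m, 1.0 / m)  # uniform mass on target samples (assumption)

    # Pairwise squared Euclidean cost, scaled to [0, 1] for numerical stability.
    diff = source[:, None, :] - target[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    cost = cost / cost.max()

    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iters):
        # Alternate scaling updates so the plan matches both marginals.
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)

    plan = u[:, None] * kernel * v[None, :]
    return plan, cost


def transport_labels(plan, source_labels):
    """Barycentric mapping: each target sample receives a weighted average
    of the continuous source labels, weighted by its column of the plan."""
    weights = plan / plan.sum(axis=0, keepdims=True)
    return weights.T @ source_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src_feats = rng.normal(size=(6, 3))         # stand-in for source-domain noisy features
    src_labels = rng.normal(size=(6, 2))        # stand-in for clean (continuous) references
    tgt_feats = rng.normal(size=(4, 3)) + 0.5   # shifted target-domain features, no labels

    plan, _ = sinkhorn_plan(src_feats, tgt_feats)
    tgt_estimates = transport_labels(plan, src_labels)
    print(tgt_estimates.shape)
```

In this sketch the target estimates are convex combinations of the source labels, so they can never leave the range seen in the source domain. That limitation is one reason a plain OT mapping is only a starting point for regression tasks such as SE.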