This paper proposes a new source model and training scheme to improve the accuracy and speed of the multichannel variational autoencoder (MVAE) method. The MVAE method is a recently proposed, powerful multichannel source separation method. It consists of pretraining a source model represented by a conditional VAE (CVAE) and then estimating the separation matrices along with the other unknown parameters so that the log-likelihood is non-decreasing given an observed mixture signal. Although the MVAE method has been shown to provide high source separation performance, one drawback is the computational cost of the backpropagation steps in the separation-matrix estimation algorithm. To overcome this drawback, a method called "FastMVAE" was subsequently proposed, which uses an auxiliary classifier VAE (ACVAE) to train the source model. By using the classifier and encoder trained in this way, the optimal parameters of the source model can be inferred efficiently, albeit approximately, in each step of the algorithm. However, the generalization capability of the trained ACVAE source model was not satisfactory, resulting in poor performance on unseen data. To improve the generalization capability, this paper proposes a new model architecture (called the "ChimeraACVAE" model) and a training scheme based on knowledge distillation. The experimental results revealed that the proposed source model trained with the proposed loss function achieved better source separation performance with less computation time than FastMVAE. We also confirmed that our methods were able to separate 18 sources with reasonably good accuracy.
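As a rough illustration of the difference between the two inference schemes described above, the sketch below contrasts a gradient-based (backpropagation) update of the CVAE source-model parameters, as in the original MVAE algorithm, with a single forward pass through an auxiliary classifier and encoder, as in the FastMVAE-style approximation. All module definitions, dimensions, loss function, and the number of gradient steps are hypothetical placeholders, not the authors' implementation.

```python
# Minimal sketch (PyTorch) of the two ways the source-model parameters
# (latent vector z and class label c) could be updated inside the
# separation algorithm. Shapes and modules are illustrative placeholders.
import torch
import torch.nn as nn

F_BINS, LATENT_DIM, N_CLASSES = 257, 16, 4

# Hypothetical stand-ins for the pretrained CVAE decoder and for the
# auxiliary classifier / encoder of an ACVAE-style source model.
decoder = nn.Sequential(nn.Linear(LATENT_DIM + N_CLASSES, F_BINS), nn.Softplus())
encoder = nn.Sequential(nn.Linear(F_BINS + N_CLASSES, LATENT_DIM))
classifier = nn.Sequential(nn.Linear(F_BINS, N_CLASSES), nn.Softmax(dim=-1))

power_spec = torch.rand(1, F_BINS)  # |separated source|^2 at one frame (toy data)

def spectral_mismatch(p, q, eps=1e-8):
    # Toy Itakura-Saito-style mismatch between observed and modeled spectra.
    r = (p + eps) / (q + eps)
    return (r - torch.log(r) - 1.0).sum()

# --- MVAE-style update: backpropagate through the decoder every iteration ---
z = torch.zeros(1, LATENT_DIM, requires_grad=True)
c = torch.full((1, N_CLASSES), 1.0 / N_CLASSES, requires_grad=True)
opt = torch.optim.Adam([z, c], lr=1e-2)
for _ in range(50):                      # many gradient steps -> costly
    opt.zero_grad()
    model_spec = decoder(torch.cat([z, torch.softmax(c, dim=-1)], dim=-1))
    loss = spectral_mismatch(power_spec, model_spec)
    loss.backward()
    opt.step()

# --- FastMVAE-style update: one forward pass, no backpropagation ---
with torch.no_grad():
    c_hat = classifier(power_spec)                            # approximate label
    z_hat = encoder(torch.cat([power_spec, c_hat], dim=-1))   # approximate latent
    model_spec = decoder(torch.cat([z_hat, c_hat], dim=-1))
```

Replacing the iterative gradient updates with a single forward pass is what removes the per-iteration backpropagation cost noted above; the ChimeraACVAE architecture and the knowledge-distillation training scheme are then aimed at making this approximate inference generalize better to unseen data.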