We propose a semi-supervised approach to source localization based on deep generative modeling with variational autoencoders (VAEs). Localization in reverberant environments remains a challenge, and machine learning (ML) has shown promise in addressing it. Even when large volumes of data are available, the number of labels available for supervised learning in reverberant environments is usually small. We address this issue by performing semi-supervised learning (SSL) with convolutional VAEs. The VAE is trained to generate the phase of relative transfer functions (RTFs), in parallel with a direction-of-arrival (DOA) classifier, on both labeled and unlabeled RTF samples. The VAE-SSL approach is compared with steered response power with phase transform (SRP-PHAT) and fully supervised convolutional neural networks (CNNs). We find that VAE-SSL can outperform both SRP-PHAT and CNNs in label-limited scenarios.
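
To make the input feature concrete, the following is a minimal sketch of how the phase of an RTF between two microphones might be estimated. It is not the authors' pipeline: the function name `rtf_phase`, the frame length, the hop size, the white-noise source, and the circular shift standing in for a propagation delay are all assumptions chosen for illustration. The RTF is estimated here as the frame-averaged cross-spectrum divided by the reference auto-spectrum, which is one common estimator among several.

```python
import numpy as np


def rtf_phase(x_ref, x_other, n_fft=256, hop=128):
    """Estimate the RTF phase per frequency bin from two microphone signals.

    Hypothetical helper: RTF = E[X_other * conj(X_ref)] / E[|X_ref|^2],
    with expectations replaced by averages over short-time frames.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x_ref) - n_fft) // hop
    cross = np.zeros(n_fft // 2 + 1, dtype=complex)
    auto = np.zeros(n_fft // 2 + 1)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + n_fft)
        X_ref = np.fft.rfft(window * x_ref[seg])
        X_other = np.fft.rfft(window * x_other[seg])
        cross += X_other * np.conj(X_ref)   # cross-spectrum accumulator
        auto += np.abs(X_ref) ** 2          # reference auto-spectrum accumulator
    rtf = cross / (auto + 1e-12)            # small constant avoids division by zero
    return np.angle(rtf)


# Toy data (assumption): white noise, with a 2-sample circular shift
# standing in for the inter-microphone delay. No reverberation is simulated.
rng = np.random.default_rng(0)
source = rng.standard_normal(16000)
delay = 2
mic_ref = source
mic_other = np.roll(source, delay)

phase = rtf_phase(mic_ref, mic_other)
```

In this anechoic toy case the phase is approximately linear in frequency, with a slope set by the delay. In a reverberant room the phase pattern is far less regular, which is the setting in which a learned model of RTF phase is proposed above.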