This report presents the system developed by the ABSP Laboratory team for the third DIHARD speech diarization challenge. Our main contribution is a simple and efficient solution for acoustic domain dependent speech diarization. We explore speaker embeddings for the \emph{acoustic domain identification} (ADI) task. Our study reveals that the i-vector based method achieves considerably better performance than the x-vector based approach on the third DIHARD challenge dataset. Next, we integrate the ADI module with the diarization framework. Performance improved substantially over the baseline when we optimized, separately for each acoustic domain, the thresholds for agglomerative hierarchical clustering and the parameters for dimensionality reduction during scoring. We achieved relative improvements of $9.63\%$ and $10.64\%$ in diarization error rate (DER) for the core and full conditions, respectively, for Track 1 of the DIHARD III evaluation set.
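To illustrate the idea of domain-dependent clustering, the following is a minimal sketch, not the submitted system. It assumes that an ADI module has already predicted a domain label for the recording and that a symmetric matrix of pairwise segment similarity scores is available (for example, from a PLDA-style backend, which is an assumption here). The names `DOMAIN_THRESHOLDS`, `DEFAULT_THRESHOLD`, `ahc`, and `diarize`, as well as all threshold values, are hypothetical and chosen only for illustration. The per-domain dimensionality reduction step is omitted.

```python
# Hypothetical per-domain stopping thresholds for AHC.
# The values are illustrative only and are NOT taken from the report.
DOMAIN_THRESHOLDS = {
    "broadcast_interview": 0.0,
    "restaurant": -0.3,
}
DEFAULT_THRESHOLD = -0.1  # hypothetical fallback for domains not listed above


def ahc(scores, threshold):
    """Average-linkage agglomerative clustering on a similarity matrix.

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity drops below `threshold`.
    Returns one integer cluster label per segment.
    """
    clusters = [[i] for i in range(len(scores))]

    def similarity(a, b):
        # Mean pairwise score between the members of two clusters.
        return sum(scores[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, (p, q) = max(
            (similarity(clusters[p], clusters[q]), (p, q))
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def diarize(scores, domain):
    """Cluster segments using the threshold assigned to the predicted domain."""
    threshold = DOMAIN_THRESHOLDS.get(domain, DEFAULT_THRESHOLD)
    return ahc(scores, threshold)


# Toy similarity matrix: segments 0 and 1 are alike, as are segments 2 and 3.
toy_scores = [
    [1.0, 0.9, -0.2, -0.2],
    [0.9, 1.0, -0.2, -0.2],
    [-0.2, -0.2, 1.0, 0.8],
    [-0.2, -0.2, 0.8, 1.0],
]
print(diarize(toy_scores, "broadcast_interview"))
print(diarize(toy_scores, "restaurant"))
```

In this toy case the same scores yield two speakers under the stricter threshold and a single speaker under the looser one, which is why tuning the threshold per domain can change the DER.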