While recent research advances in speaker diarization mostly focus on improving the quality of diarization results, there is also an increasing interest in improving the efficiency of diarization systems. In this paper, we propose a multi-stage clustering strategy that uses different clustering algorithms for inputs of different lengths. Specifically, a fallback clusterer handles short-form inputs; a main clusterer handles medium-length inputs; and a pre-clusterer compresses long-form inputs before they are processed by the main clusterer. Both the main clusterer and the pre-clusterer can be configured with an upper bound on computational complexity to adapt to devices with different constraints. This multi-stage clustering strategy is critical for streaming on-device speaker diarization systems, where the budgets of CPU, memory and battery are tight.
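
The length-based dispatch described above can be sketched as follows. This is only an illustrative sketch, not the paper's implementation: the function name `multi_stage_cluster`, the thresholds `min_main_size` and `max_main_size`, and the three toy clusterers are assumptions introduced here, and the abstract does not specify which algorithms fill each role.

```python
from typing import Callable, List, Sequence, Tuple

Embedding = Sequence[float]
# A clusterer maps N embeddings to N integer speaker labels.
Clusterer = Callable[[List[Embedding]], List[int]]
# A pre-clusterer compresses N embeddings into at most `budget` representatives
# and returns, for each input, the index of its representative.
PreClusterer = Callable[[List[Embedding], int], Tuple[List[Embedding], List[int]]]


def multi_stage_cluster(
    embeddings: List[Embedding],
    fallback: Clusterer,
    main: Clusterer,
    pre: PreClusterer,
    min_main_size: int,  # hypothetical: below this, use the fallback clusterer
    max_main_size: int,  # hypothetical: upper bound on the main clusterer's input
) -> List[int]:
    n = len(embeddings)
    if n < min_main_size:
        # Short-form input: the fallback clusterer handles it directly.
        return fallback(embeddings)
    if n <= max_main_size:
        # Medium-length input: the main clusterer handles it directly.
        return main(embeddings)
    # Long-form input: compress first, cluster the representatives,
    # then propagate each representative's label back to its members.
    reps, assignment = pre(embeddings, max_main_size)
    rep_labels = main(reps)
    return [rep_labels[i] for i in assignment]


# Toy stand-ins (assumptions for the demo only).
def single_speaker(embeddings: List[Embedding]) -> List[int]:
    return [0] * len(embeddings)


def sign_clusterer(embeddings: List[Embedding]) -> List[int]:
    # Two clusters, split on the sign of the first coordinate.
    return [0 if e[0] >= 0 else 1 for e in embeddings]


def chunk_mean_pre_clusterer(
    embeddings: List[Embedding], budget: int
) -> Tuple[List[Embedding], List[int]]:
    # Average consecutive chunks so that at most `budget` representatives remain.
    n = len(embeddings)
    chunk = -(-n // budget)  # ceiling division
    reps: List[Embedding] = []
    assignment: List[int] = []
    for start in range(0, n, chunk):
        group = embeddings[start:start + chunk]
        dim = len(group[0])
        reps.append([sum(e[d] for e in group) / len(group) for d in range(dim)])
        assignment.extend([len(reps) - 1] * len(group))
    return reps, assignment


long_input = [[1.0], [1.0], [-1.0], [-1.0], [1.0], [1.0], [-1.0], [-1.0]]
labels = multi_stage_cluster(
    long_input, single_speaker, sign_clusterer, chunk_mean_pre_clusterer,
    min_main_size=3, max_main_size=4,
)
```

Because the main clusterer never sees more than `max_main_size` items, its cost is bounded regardless of how long the input grows, which is the property the abstract highlights for on-device use.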