Significant progress has recently been made in speaker diarisation after the introduction of d-vectors as speaker embeddings extracted from neural network (NN) speaker classifiers for clustering speech segments. To extract better-performing and more robust speaker embeddings, this paper proposes a c-vector method that combines multiple sets of complementary d-vectors derived from systems with different NN components. Three structures are used to implement the c-vectors, namely 2D self-attentive, gated additive, and bilinear pooling structures, relying on an attention mechanism, a gating mechanism, and a low-rank bilinear pooling mechanism respectively. Furthermore, a neural-based single-pass speaker diarisation pipeline is also proposed in this paper, which uses NNs to achieve voice activity detection, speaker change point detection, and speaker embedding extraction. Experiments and detailed analyses are conducted on the challenging AMI and NIST RT05 datasets, which consist of real meetings with 4--10 speakers and a wide range of acoustic conditions. For systems trained on the AMI training set, relative speaker error rate (SER) reductions of 13% and 29% are obtained by using c-vectors instead of d-vectors on the AMI dev and eval sets respectively, and a relative reduction of 15% in SER is observed on RT05, which shows the robustness of the proposed methods. By incorporating VoxCeleb data into the training set, the best c-vector system achieves relative SER reductions of 7%, 17%, and 16% compared to the d-vector on the AMI dev, eval, and RT05 sets respectively.
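As an illustration of the gating mechanism underlying the gated additive structure, the following is a minimal NumPy sketch of combining two complementary d-vectors with an element-wise sigmoid gate. The function name `gated_combination`, the parameters `W` and `b`, and the specific gating formula c = g * d1 + (1 - g) * d2 are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gated_combination(d1, d2, W, b):
    """Combine two speaker embeddings with an element-wise sigmoid gate:
    g = sigmoid(W [d1; d2] + b),  c = g * d1 + (1 - g) * d2.
    (Hypothetical sketch of a gated additive combination; W and b would
    be learned jointly with the rest of the network in practice.)"""
    z = np.concatenate([d1, d2])            # stack the two d-vectors
    g = 1.0 / (1.0 + np.exp(-(W @ z + b)))  # gate values lie in (0, 1)
    return g * d1 + (1.0 - g) * d2          # convex per-dimension mixture

rng = np.random.default_rng(0)
dim = 128                                    # illustrative embedding size
d1 = rng.standard_normal(dim)                # d-vector from one system (toy data)
d2 = rng.standard_normal(dim)                # d-vector from another system (toy data)
W = 0.01 * rng.standard_normal((dim, 2 * dim))
b = np.zeros(dim)
c = gated_combination(d1, d2, W, b)          # combined "c-vector"-style embedding
```

Because the gate is applied per dimension, the combined embedding can rely on one system's d-vector in some dimensions and the other's elsewhere, rather than using a single global interpolation weight.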