This report describes the submission of the DKU-DukeECE-Lenovo team to Track 4 of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2021. Our system includes a voice activity detection (VAD) model, a speaker embedding model, two clustering-based speaker diarization systems with different similarity measurements, two different overlapped speech detection (OSD) models, and a target-speaker voice activity detection (TS-VAD) model. Our final submission, a fusion of 5 independent systems, achieves a diarization error rate (DER) of 5.07% on the challenge test set.