The third instalment of the VoxCeleb Speaker Recognition Challenge (VoxSRC) was held in conjunction with Interspeech 2021. The aim of this challenge was to assess how well current speaker recognition technology is able to diarise and recognise speakers in unconstrained or 'in the wild' data. The challenge consisted of: (i) the provision of publicly available speaker recognition and diarisation data from YouTube videos, together with ground truth annotation and standardised evaluation software; and (ii) a virtual public challenge and workshop held at Interspeech 2021. This paper outlines the challenge and describes the baselines, methods and results. We conclude with a discussion of the new multi-lingual focus of VoxSRC 2021, and of how the challenge has progressed since the previous two editions.