In 2022, the U.S. National Institute of Standards and Technology (NIST) conducted the latest Language Recognition Evaluation (LRE), part of an ongoing series that NIST has administered since 1996 to foster research in language recognition and to measure state-of-the-art technology. Like previous LREs, LRE22 focused on conversational telephone speech (CTS) and broadcast narrowband speech (BNBS) data. LRE22 also introduced new evaluation features, such as an emphasis on African languages, including low-resource languages, and a test set of segments containing between 3 s and 35 s of speech, randomly sampled and extracted from longer recordings. A total of 21 research organizations, forming 16 teams, participated in this three-month evaluation and made 65 valid system submissions. This paper presents an overview of LRE22 and an analysis of system performance over different evaluation conditions. The evaluation results suggest that Oromo and Tigrinya are easier to detect, while Xhosa and Zulu are more challenging. Some language pairs show greater confusability than others. As speech duration increased, system performance improved significantly up to a certain duration, after which diminishing returns were observed.