We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised speech representations for better performance, generalization, and efficiency. The challenge builds upon the SUPERB benchmark and implements metrics to measure the computation requirements of self-supervised learning (SSL) representations and to evaluate their generalizability and performance across the diverse SUPERB tasks. The SUPERB benchmark provides comprehensive coverage of popular speech processing tasks, from speech and speaker recognition to audio generation and semantic understanding. As SSL has gained interest in the speech community and shown promising outcomes, we envision the challenge upleveling the impact of SSL techniques by motivating more practical designs beyond task performance alone. In this paper, we summarize the results of 14 submitted models, discuss the main findings from those submissions, and outline future directions of SSL research.