SpeechBrain is an open-source, all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. This paper describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare, and share novel speech processing pipelines. SpeechBrain achieves competitive or state-of-the-art performance on a wide range of speech benchmarks. It also provides training recipes, pretrained models, and inference scripts for popular speech datasets, as well as tutorials that allow anyone with basic Python proficiency to familiarize themselves with speech technologies.