PaddleSpeech is an open-source, all-in-one speech toolkit. It aims to facilitate the development and research of speech processing technologies by providing an easy-to-use command-line interface and a simple code structure. This paper describes the design philosophy and core architecture of PaddleSpeech, which support several essential speech-to-text and text-to-speech tasks. PaddleSpeech implements the most popular methods and achieves competitive or state-of-the-art performance on various speech datasets. It also provides recipes and pretrained models for quickly reproducing the experimental results in this paper. PaddleSpeech is publicly available at https://github.com/PaddlePaddle/PaddleSpeech.