Neural speech synthesis models can produce high-quality speech, but they typically do so at a high computational cost. In previous work, we introduced LPCNet, which uses linear prediction to significantly reduce the complexity of neural synthesis. In this work, we further improve the efficiency of LPCNet through both algorithmic and computational improvements, making it usable on a wide variety of devices. We demonstrate an improvement in synthesis quality while operating 2.5x faster. The resulting open-source LPCNet algorithm can perform real-time neural synthesis on most existing phones and is even usable on some embedded devices.