This paper proposes weight regularization for a faster neural vocoder. Pruning time-consuming DNN modules is a promising way to realize a real-time vocoder on a CPU (e.g. WaveRNN, LPCNet). Regularization that encourages sparsity is also effective in avoiding the quality degradation caused by pruning. However, the nonzero weights of the weight matrices must be contiguous in blocks of the SIMD width for fast vocoding. To ensure this layout, we propose explicitly SIMD-size-aware regularization. Our proposed method reshapes a weight matrix into a tensor so that the weights are aligned by group size in advance, and then computes a group Lasso-like regularization loss. Experiments on a 70% sparse subband WaveRNN show that pruning with conventional Lasso and column-wise group Lasso degrades the synthetic speech's naturalness. The vocoder with the proposed regularization 1) achieves naturalness comparable to that without pruning and 2) runs meaningfully faster than conventional vocoders using the other regularizations.
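The reshape-then-penalize idea described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name, the group size, and the use of a plain sum of group L2 norms are assumptions for exposition.

```python
import numpy as np

def simd_group_lasso(W, group_size=4):
    """Illustrative SIMD-size-aware group Lasso penalty.

    Reshapes the 2-D weight matrix so that each row is split into
    contiguous groups of `group_size` weights (the assumed SIMD width),
    then sums the L2 norms of the groups. Driving whole groups toward
    zero yields sparsity patterns aligned to SIMD-width blocks.
    """
    out_dim, in_dim = W.shape
    # In practice the input dimension would be padded to a multiple
    # of the SIMD width; here we simply require it.
    assert in_dim % group_size == 0, "in_dim must be a multiple of group_size"
    groups = W.reshape(out_dim, in_dim // group_size, group_size)
    # L2 norm per group, then sum over all groups (group Lasso-like loss).
    return np.sqrt((groups ** 2).sum(axis=-1)).sum()

# Example: penalty for a small random weight matrix with 4-wide groups.
W = np.random.randn(8, 16)
penalty = simd_group_lasso(W, group_size=4)
```

In training, such a penalty would be added (with a weighting coefficient) to the vocoder's main loss, so that pruning the weights that the regularizer has driven toward zero removes whole SIMD-aligned blocks rather than scattered elements.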