Full-band speech enhancement systems based on deep neural networks face two challenges: a heavy demand for computational resources and an imbalanced frequency distribution. This paper proposes a lightweight full-band model built on two dedicated strategies: a learnable spectral compression mapping that compresses high-band spectral information more effectively, and a multi-head attention mechanism that models the global spectral pattern more effectively. Experiments validate both strategies and show that the proposed model achieves competitive performance with only 0.89M parameters.
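To make the idea of spectral compression concrete, the following is a minimal sketch rather than the model's actual mapping, whose exact form the abstract does not specify. It assumes that low-band bins pass through unchanged and that high-band bins are reduced to fewer bands by a linear map. The function names, the band split `n_low`, and the averaging initialisation are all hypothetical; in a trained model the weight matrix would be a learnable parameter updated by gradient descent.

```python
def init_compression_weights(n_high_in, n_high_out):
    """Build an (n_high_out x n_high_in) matrix initialised as band averaging.

    Assumption for illustration only: each high-band input bin is assigned to
    one output band, and every non-empty row is normalised to sum to 1. In a
    learnable mapping these values would only be the starting point.
    """
    weights = [[0.0] * n_high_in for _ in range(n_high_out)]
    for k in range(n_high_in):
        j = k * n_high_out // n_high_in  # output band that input bin k falls into
        weights[j][k] = 1.0
    for row in weights:
        total = sum(row)
        if total > 0:
            for k in range(n_high_in):
                row[k] /= total
    return weights


def compress_spectrum(frame, n_low, weights):
    """Keep the first n_low bins as they are and compress the remaining bins."""
    low = list(frame[:n_low])
    high = frame[n_low:]
    compressed = [sum(w * x for w, x in zip(row, high)) for row in weights]
    return low + compressed


# Toy magnitude frame: 4 low-band bins and 4 high-band bins.
frame = [1, 2, 3, 4, 10, 20, 30, 50]
weights = init_compression_weights(n_high_in=4, n_high_out=2)
print(compress_spectrum(frame, n_low=4, weights=weights))
```

With this initialisation the four high-band bins are averaged pairwise, so the eight-bin frame shrinks to six values while the low band keeps its full resolution. The sketch does not cover the attention component.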