We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model for bandwidth extension. The proposed architecture simplifies the UNet backbone of TFiLM to reduce inference time and employs an efficient transformer at the bottleneck to alleviate performance degradation. We also utilize self-supervised pretraining and data augmentation to enhance the quality of bandwidth-extended signals and to reduce sensitivity to the choice of downsampling method. Experimental results on the VCTK dataset show that the proposed method outperforms several recent baselines on both intrusive and non-intrusive metrics. Pretraining and filter augmentation also help stabilize and enhance overall performance.
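For readers unfamiliar with the underlying mechanism, the following is a minimal PyTorch sketch of a TFiLM layer in the spirit of Birnbaum et al. (2019), which the abstract builds on: an RNN over block-pooled features predicts per-block modulation applied to a convolutional feature map. This is an illustrative reconstruction, not the authors' block-online variant; the channel count, block size, and LSTM configuration are assumptions.

```python
import torch
import torch.nn as nn


class TFiLM(nn.Module):
    """Illustrative temporal FiLM layer: pool each temporal block to a
    summary vector, run an LSTM over the block sequence, and modulate
    every block of the feature map with the corresponding LSTM output.
    Hyperparameters here are placeholders, not the paper's settings."""

    def __init__(self, channels: int, block_size: int):
        super().__init__()
        self.block_size = block_size
        # Max-pool with stride == kernel size yields one vector per block.
        self.pool = nn.MaxPool1d(block_size)
        self.rnn = nn.LSTM(channels, channels, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); time must be divisible by block_size.
        b, c, t = x.shape
        n_blocks = t // self.block_size
        # Block summaries: (batch, n_blocks, channels)
        pooled = self.pool(x).transpose(1, 2)
        # One modulation vector per block from the recurrent pass.
        mod, _ = self.rnn(pooled)                    # (batch, n_blocks, channels)
        mod = mod.transpose(1, 2).unsqueeze(-1)      # (batch, channels, n_blocks, 1)
        # Apply block-wise multiplicative modulation and restore the shape.
        x = x.view(b, c, n_blocks, self.block_size) * mod
        return x.view(b, c, t)


# Example: modulate a feature map of 32 channels over 128 time steps.
layer = TFiLM(channels=32, block_size=16)
features = torch.randn(4, 32, 128)
out = layer(features)  # (4, 32, 128)
```

In a block-online setting such as the one the abstract describes, the recurrent state would be carried across incoming blocks rather than recomputed over the full utterance, which is what makes the layer amenable to streaming inference.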