Recent advances in deep neural networks (DNNs) have significantly improved various audio processing applications, including speech enhancement, synthesis, and hearing-aid algorithms. DNN-based closed-loop systems have gained popularity in these applications due to their robust performance and ability to adapt to diverse conditions. Despite their effectiveness, current DNN-based closed-loop systems often suffer from sound quality degradation caused by artifacts introduced by suboptimal sampling methods. To address this challenge, we introduce dCoNNear, a novel DNN architecture designed for seamless integration into closed-loop frameworks. This architecture specifically aims to prevent the generation of spurious artifacts, most notably the tonal and aliasing artifacts that arise from non-ideal sampling layers. We demonstrate the effectiveness of dCoNNear with a proof-of-principle example: a closed-loop framework that uses biophysically realistic models of auditory processing, for both normal-hearing and hearing-impaired profiles, to design personalized hearing-aid algorithms. We further validate the broader applicability and artifact-free performance of dCoNNear through speech-enhancement experiments, confirming its ability to improve perceptual sound quality without introducing architecture-induced artifacts. Our results show that dCoNNear not only accurately simulates all processing stages of existing non-DNN biophysical models but also significantly improves sound quality by eliminating audible artifacts in both hearing-aid and speech-enhancement applications. This study offers a robust, perceptually transparent closed-loop processing framework for high-fidelity audio applications.