When environmental noise is disturbing, acoustic masking is a conventional audio-engineering technique for reducing the annoyance: it covers the noise with another dominant yet less intrusive sound. However, misalignment between the dominant sound and the noise, such as mismatched downbeats, often requires an excessive volume increase to achieve effective masking. Motivated by recent advances in cross-modal generation, we introduce an alternative to acoustic masking that reduces the noticeability of environmental noise by blending it into personalized music generated from user-provided text prompts. Following the paradigm of music generation with mel-spectrogram representations, we propose the Blending Noises into Personalized Music (BNMusic) framework, which has two key stages. The first stage synthesizes a complete piece of music in a mel-spectrogram representation that encapsulates the musical essence of the noise. The second stage adaptively amplifies the generated music segment to further reduce noise perception and enhance blending effectiveness while preserving auditory quality. Comprehensive evaluations on MusicBench, EPIC-SOUNDS, and ESC-50 demonstrate the effectiveness of our framework: it blends environmental noise with rhythmically aligned, adaptively amplified, and enjoyable music segments, minimizing the noticeability of the noise and thereby improving the overall acoustic experience. Project page: https://d-fas.github.io/BNMusic_page/.