Microphones are easily degraded by ambient noise and blocked by soundproof materials, whereas radio frequency (RF) signals are immune to acoustic noise and can traverse many soundproof objects, which makes them a promising candidate for audio recovery. In this paper, we introduce Radio2Speech, a system that uses RF signals to recover high-quality speech from a loudspeaker. Radio2Speech recovers speech of a quality comparable to a microphone recording, advancing beyond existing approaches that recover only single-tone music or unintelligible speech. We use Radio UNet to accurately recover speech in the time-frequency domain from RF signals with a limited frequency band. We also incorporate a neural vocoder to synthesize the speech waveform from the estimated time-frequency representation without using the contaminated phase. Quantitative and qualitative evaluations show that in quiet, noisy, and soundproof scenarios, Radio2Speech achieves state-of-the-art performance and is on par with a microphone operating in a quiet scenario.
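To make the phrase "time-frequency representation without using the contaminated phase" concrete, the sketch below computes a magnitude-only spectrogram with NumPy. It is an illustration under stated assumptions, not the Radio2Speech implementation: the function name `magnitude_spectrogram`, the frame length, hop size, sample rate, and the synthetic 1 kHz test tone are all hypothetical choices, and neither Radio UNet nor the neural vocoder is reproduced here.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return |STFT| with shape (num_frames, frame_len // 2 + 1).

    Only the magnitude is kept; the phase is discarded, which is the
    kind of input a vocoder could synthesize a waveform from.
    (Frame length and hop are assumed values, not taken from the paper.)
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(num_frames)]
    )
    # Real FFT of each windowed frame, then drop the phase via abs().
    return np.abs(np.fft.rfft(frames * window, axis=1))

# Hypothetical test input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = magnitude_spectrogram(tone)
# Frequency bin with the most energy, averaged over all frames.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
print(spec.shape, peak_hz)
```

Because the array returned by `magnitude_spectrogram` is real and non-negative, any phase corruption in the underlying signal has no effect on it. Under the paper's description, a waveform would then be synthesized from such a representation by a neural vocoder rather than by inverting the STFT with the original phase.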