A method is proposed for speech enhancement that uses ego-noise references with a microphone array embedded in an unmanned aerial vehicle (UAV). The ego-noise reference signals are captured by microphones located near the UAV's propellers and used in the prior knowledge multichannel Wiener filter (PK-MWF) to obtain the speech correlation matrix estimate. The speech presence probability (SPP) used to detect speech activity can be estimated either from an external microphone near the speech source, which provides a performance benchmark, or from one of the embedded microphones, which reflects a more realistic scenario. Experimental measurements are performed in a semi-anechoic chamber, with the UAV mounted on a stand and a loudspeaker playing a speech signal. Three distinct, fixed propeller rotation speeds are used, resulting in three different signal-to-noise ratios (SNRs). The recordings, which are made available online, are used to compare the proposed method with the standard multichannel Wiener filter (MWF), estimated both with and without the propeller microphones in its formulation. Results show that, compared with both MWF variants, the PK-MWF achieves greater improvements in speech intelligibility and quality, as measured by STOI and PESQ, while the SNR improvement is similar.
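For orientation, the standard MWF baseline mentioned above can be sketched for a single frequency bin as follows. This is a minimal illustration under stated assumptions, not the PK-MWF and not the authors' implementation: the speech correlation matrix is taken as the difference between the speech-plus-noise and noise-only correlation matrices, an oracle activity mask stands in for an SPP-based detector, and the signals are synthetic. All function names, array sizes, and noise levels are hypothetical.

```python
import numpy as np


def correlation_matrix(frames):
    """Sample correlation matrix E[y y^H] from frames of shape (num_frames, num_mics)."""
    return frames.T @ frames.conj() / frames.shape[0]


def mwf_weights(R_yy, R_nn, ref=0):
    """Standard MWF for one bin: w = R_yy^{-1} (R_yy - R_nn) e_ref."""
    R_xx = R_yy - R_nn  # speech correlation matrix estimate
    return np.linalg.solve(R_yy, R_xx[:, ref])


def snr_db(speech, noise):
    """Ratio of mean speech power to mean noise power, in dB."""
    return 10 * np.log10(np.mean(np.abs(speech) ** 2) / np.mean(np.abs(noise) ** 2))


def demo(num_mics=4, num_frames=4000, seed=0):
    """Synthetic single-bin experiment (assumed setup, not the paper's data)."""
    rng = np.random.default_rng(seed)

    def cgauss(*shape):
        # Circular complex Gaussian samples with unit variance
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    # One speech source with a random unit-modulus steering vector
    steering = np.exp(1j * rng.uniform(0, 2 * np.pi, num_mics))
    speech = cgauss(num_frames, 1) * steering  # shape (num_frames, num_mics)

    # Oracle activity mask (stand-in for SPP): speech present in the first half only
    active = np.arange(num_frames) < num_frames // 2
    speech[~active] = 0

    # Two strong, spatially correlated noise sources plus weak sensor noise
    mixing = cgauss(2, num_mics)
    noise = 2.0 * cgauss(num_frames, 2) @ mixing + 0.1 * cgauss(num_frames, num_mics)
    mics = speech + noise

    R_yy = correlation_matrix(mics[active])   # speech-plus-noise frames
    R_nn = correlation_matrix(mics[~active])  # noise-only frames
    w = mwf_weights(R_yy, R_nn)

    # Filter output is w^H y per frame; evaluate speech and noise parts separately
    snr_in = snr_db(speech[active, 0], noise[active, 0])
    snr_out = snr_db(speech[active] @ w.conj(), noise[active] @ w.conj())
    return snr_in, snr_out


if __name__ == "__main__":
    snr_in, snr_out = demo()
    print(f"input SNR: {snr_in:.1f} dB, output SNR: {snr_out:.1f} dB")
```

In a real system these statistics would be tracked per frequency bin and updated recursively, with the SPP weighting each frame's contribution instead of a hard mask.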