Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection; when applied to the early stage of an event, they suffer from over-confidence and usually yield unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probabilities. Specifically, we model the class probabilities with a Beta distribution, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously improves time delay by 13.0\% and detection F1 score by 3.8\% compared to state-of-the-art methods.
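To make the role of the Beta distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common subjective-logic parameterization, in which non-negative positive and negative evidence for one class give alpha = positive evidence + 1 and beta = negative evidence + 1, and in which vacuity (uncertainty due to lack of evidence) is 2 / (alpha + beta). The function name `beta_opinion` and the evidence values are hypothetical.

```python
def beta_opinion(pos_evidence, neg_evidence):
    """Summarize a Beta(alpha, beta) over one class probability.

    Assumption: alpha = pos_evidence + 1 and beta = neg_evidence + 1,
    with a non-informative prior weight of 2 (subjective logic).
    """
    alpha = pos_evidence + 1.0
    beta = neg_evidence + 1.0
    strength = alpha + beta
    expected_prob = alpha / strength  # mean of the Beta distribution
    vacuity = 2.0 / strength          # high when little evidence was collected
    return expected_prob, vacuity


# Hypothetical evidence for one event class at two points in time.
early = beta_opinion(2.0, 0.0)   # few frames observed so far
late = beta_opinion(18.0, 0.0)   # many frames observed

print("early: p=%.2f, vacuity=%.2f" % early)
print("late:  p=%.2f, vacuity=%.2f" % late)
```

In this sketch the early prediction already has an expected probability of 0.75, yet its vacuity is 0.5, which flags it as resting on little evidence. A plain sigmoid or softmax output would expose only the probability, which is the over-confidence problem the abstract describes.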