In this work, we present HIDACT, a novel adaptive-computation network architecture for efficient acoustic event recognition. We evaluate the model on a sound event detection task, training it to process frequency bands adaptively. The model learns to adapt to its input and does not need to request every available frequency sub-band. It can make confident predictions in fewer processing steps, which reduces the amount of computation. Experimental results show that HIDACT performs comparably to baseline models that have more parameters and higher computational complexity. Furthermore, the model can adjust its amount of computation according to the data and the computational budget.
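To make the idea of halting early over frequency sub-bands concrete, the following is a minimal sketch of confidence-thresholded adaptive computation. It is not HIDACT's actual architecture or halting rule, which the abstract does not specify. The names `adaptive_predict`, `update`, `threshold`, and `max_steps` are hypothetical, and the toy `add_evidence` update stands in for a learned network.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_predict(
    bands: Sequence[Sequence[float]],
    update: Callable[[List[float], Sequence[float]], List[float]],
    num_classes: int,
    threshold: float = 0.9,
    max_steps: Optional[int] = None,
) -> Tuple[int, float, int]:
    """Consume sub-bands one at a time and stop once the prediction is confident.

    Assumption (not from the paper): halting is decided by comparing the
    maximum softmax probability against a fixed threshold, and `max_steps`
    models an external computational budget.
    Returns (predicted label, its probability, number of bands consumed).
    """
    logits = [0.0] * num_classes
    budget = len(bands) if max_steps is None else min(max_steps, len(bands))
    probs = softmax(logits)
    steps = 0
    for band in bands[:budget]:
        logits = update(logits, band)  # hypothetical per-step model update
        probs = softmax(logits)
        steps += 1
        if max(probs) >= threshold:  # confident enough: skip remaining bands
            break
    label = max(range(num_classes), key=probs.__getitem__)
    return label, probs[label], steps


def add_evidence(logits: List[float], band: Sequence[float]) -> List[float]:
    """Toy stand-in for a learned model: each band adds per-class evidence."""
    return [l + b for l, b in zip(logits, band)]


# Four sub-bands, each favoring class 0; values are illustrative only.
toy_bands = [[2.0, 0.0]] * 4
label, confidence, steps_used = adaptive_predict(toy_bands, add_evidence, num_classes=2)
```

In this toy setting the loop stops before all four sub-bands are consumed, and lowering `max_steps` forces an earlier, less confident prediction, which mirrors the trade-off between accuracy and computational budget described above.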