In this paper, we revisit the use of spectrograms in neural networks, making the window length a continuous parameter optimizable by gradient descent instead of an empirically tuned integer-valued hyperparameter. The contribution is mostly theoretical at this point, but plugging the modified STFT into any existing neural network is straightforward. We first define a differentiable version of the STFT in the case where local bin centers are fixed and independent of the window length parameter. We then discuss the more difficult case where the window length affects the position and number of bins. We illustrate the benefits of this new tool on an estimation problem and a classification problem, showing it can be of interest not only to neural networks but to any STFT-based signal processing algorithm.
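To make the core idea concrete, here is a minimal sketch (not the paper's actual method) of an STFT whose effective window length is a continuous parameter: a Gaussian window of width `sigma` is evaluated on a fixed sample grid, so the resulting spectrogram varies smoothly with `sigma` and admits a gradient. The frame size `n`, hop `hop`, and the finite-difference check are illustrative choices.

```python
import numpy as np

def gaussian_window(sigma, n=64):
    # Continuous width parameter sigma controls the effective window
    # length; the window is sampled on a fixed grid of n points.
    t = np.arange(n) - (n - 1) / 2
    return np.exp(-0.5 * (t / sigma) ** 2)

def stft_mag(x, sigma, n=64, hop=16):
    # STFT magnitude with a Gaussian window of continuous width sigma.
    frames = np.stack([x[i:i + n] for i in range(0, len(x) - n + 1, hop)])
    return np.abs(np.fft.rfft(frames * gaussian_window(sigma, n), axis=1))

# Because sigma enters smoothly, a loss built on the spectrogram is
# differentiable in sigma; here we check with central finite differences.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
loss = lambda s: stft_mag(x, s).sum()
eps = 1e-4
grad = (loss(10.0 + eps) - loss(10.0 - eps)) / (2 * eps)
print(f"d(loss)/d(sigma) = {grad:.4f}")
```

In an actual network this gradient would come from automatic differentiation rather than finite differences, letting the optimizer adjust the time-frequency trade-off jointly with the other weights.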