The constant ratio of center frequency to bandwidth (Q-factor) of wavelet transforms provides a very natural representation for audio data. However, invertible wavelet transforms have so far required either non-uniform decimation, which leads to irregular data structures that are cumbersome to work with, or excessively high oversampling with unacceptable computational overhead. Here, we present a novel decimation strategy for wavelet transforms that leads to stable representations with uniform decimation and oversampling rates close to one. Specifically, we show that finite implementations of the resulting representation are energy-preserving in the sense of frame theory. The obtained wavelet coefficients can be stored in a time-frequency matrix with a natural interpretation of columns as time frames and rows as frequency channels. This matrix structure immediately grants access to a large number of algorithms that are successfully used in time-frequency audio processing but could not previously be used jointly with wavelet transforms. We demonstrate the application of our method in processing based on nonnegative matrix factorization, in onset detection, and in phaseless reconstruction.
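To make "energy-preserving in the sense of frame theory" concrete, the following minimal sketch checks the defining property of a tight frame, namely that the coefficient energy equals a constant A times the signal energy and that the signal is recovered by applying the adjoint and dividing by A. It is not the wavelet system proposed here: as an illustrative assumption it uses a toy frame made of the standard basis together with the unitary DFT basis (so A = 2), and the function names `analysis` and `synthesis` are hypothetical.

```python
import numpy as np

def analysis(x):
    """Frame coefficients of x for a toy tight frame:
    standard basis plus unitary DFT basis (frame bound A = 2).
    This is an illustrative stand-in, not the paper's wavelet frame."""
    n = len(x)
    dft = np.fft.fft(x) / np.sqrt(n)  # unitary DFT coefficients
    return np.concatenate([x.astype(complex), dft])

def synthesis(c):
    """Reconstruct the signal: apply the adjoint of the analysis
    operator and divide by the frame bound A = 2."""
    n = len(c) // 2
    time_part = c[:n]
    freq_part = np.fft.ifft(c[n:]) * np.sqrt(n)  # adjoint of the unitary DFT
    return (time_part + freq_part) / 2.0

rng = np.random.default_rng(0)
x = rng.standard_normal(64)

c = analysis(x)
# For a tight frame this ratio equals the frame bound A for every signal.
energy_ratio = np.sum(np.abs(c) ** 2) / np.sum(np.abs(x) ** 2)
x_rec = synthesis(c)
```

In this toy case the redundancy is 2, since 128 coefficients represent 64 samples. The decimation strategy described above targets the same kind of energy preservation at oversampling rates close to one.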