Audio and visual data analysis tasks usually involve high-dimensional, nonnegative signals. However, most data analysis methods suffer from overfitting and numerical problems when the data have more than a few dimensions, so a dimensionality reduction step is needed as preprocessing. Moreover, interpretability of how and why filters work is desirable in audio and visual applications, especially when energy or spectral signals are involved. In these cases, because of the nature of the signals, nonnegative filter weights make the filters easier to understand. To meet both needs, we propose several methods that reduce the dimensionality of the data while guaranteeing the nonnegativity and interpretability of the solution. In particular, we propose a general methodology for the supervised design of filter banks in applications dealing with nonnegative data, and we explore different ways of solving the proposed objective function, which is a nonnegative version of the orthonormalized partial least-squares method. We analyze the discriminative power of the features obtained with the proposed methods in two different and widely studied applications: texture classification and music genre classification. Furthermore, we compare the filter banks obtained by our methods with those of other state-of-the-art methods specifically designed for feature extraction.