Matrix approximations are a key element in large-scale algebraic machine learning approaches. The recently proposed MEKA method (Si et al., 2014) effectively combines two common assumptions in Hilbert spaces: the low-rank property of the inner product matrix obtained from a shift-invariant kernel function and a data-compactness hypothesis expressed through an inherent block-cluster structure. In this work, we extend MEKA so that it applies not only to shift-invariant kernels but also to non-stationary kernels such as polynomial kernels and an extreme learning kernel. We also address in detail how to handle non-positive semi-definite kernel functions within MEKA, whether caused by the approximation itself or by the intentional use of general kernel functions. We present a Lanczos-based estimation of a spectrum shift that yields a stable positive semi-definite MEKA approximation, which remains usable in classical convex optimization frameworks. Furthermore, we support our findings with theoretical considerations and a variety of experiments on synthetic and real-world data.
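To illustrate the spectrum-shift idea mentioned above, the following is a minimal sketch, not the paper's implementation: it estimates the smallest eigenvalue of a (possibly indefinite) approximated kernel matrix with a Lanczos-type solver and, if that eigenvalue is negative, shifts the spectrum by its magnitude to obtain a positive semi-definite matrix. The function name and the toy matrix are illustrative assumptions.

```python
# Minimal sketch (assumption: the approximated kernel matrix K is available
# as a symmetric NumPy array). eigsh relies on an implicitly restarted
# Lanczos/Arnoldi method to estimate extremal eigenvalues.
import numpy as np
from scipy.sparse.linalg import eigsh

def psd_spectrum_shift(K, tol=1e-10):
    """Return K + |lambda_min| * I if the smallest eigenvalue of K is negative."""
    # Estimate the smallest algebraic eigenvalue with a few Lanczos iterations.
    lam_min = eigsh(K, k=1, which='SA', return_eigenvectors=False)[0]
    if lam_min < -tol:
        K = K + (-lam_min) * np.eye(K.shape[0])
    return K

# Toy usage: a random symmetric (hence generally indefinite) matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
K = (A + A.T) / 2.0
K_psd = psd_spectrum_shift(K)
print(np.linalg.eigvalsh(K_psd))  # all eigenvalues now >= 0 (up to tolerance)
```

Such a global shift adds a constant to every eigenvalue, so it preserves the eigenvectors of the approximation while making it usable in solvers that require a positive semi-definite kernel.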