Low dimensional nonlinear structure abounds in datasets across computer vision and machine learning. Kernelized matrix factorization techniques have recently been proposed to learn these nonlinear structures for denoising, classification, dictionary learning, and missing data imputation, by observing that the image of the matrix in a sufficiently large feature space is low-rank. However, these nonlinear methods fail in the presence of sparse noise or outliers. In this work, we propose a new robust nonlinear factorization method called Robust Non-Linear Matrix Factorization (RNLMF). RNLMF constructs a dictionary for the data space by factoring a kernelized feature space; a noisy matrix can then be decomposed as the sum of a sparse noise matrix and a clean data matrix that lies in a low dimensional nonlinear manifold. RNLMF is robust to sparse noise and outliers and scales to matrices with thousands of rows and columns. Empirically, RNLMF achieves noticeable improvements over baseline methods in denoising and clustering.
翻译:跨计算机视野和机器学习的数据集中存在着大量的低维非线性结构。最近,为了学习这些非线性结构以进行拆卸、分类、字典学习和缺失的数据估算,提出了内核矩阵化要素化技术,通过观察在足够大的地貌空间中矩阵的图像是低空的。然而,这些非线性方法在有稀少的噪音或外源的情况下失败。在这项工作中,我们提出了一种新的强势非线性非线性因子化方法,称为Robust Non-Linear 矩阵因数化(RNLMF) 。RNLMF通过对内核特征空间进行系数化,为数据空间构建了词典;随后,噪音矩阵可以作为分散噪音矩阵和清洁数据矩阵的总和进行分解。RNLMF对稀有噪音,外源和量级与数千行和列的基质的基体进行稀释。Empiric,RNLMF在脱色和集群的基准方法上取得了显著的改进。