We describe a minimalistic and interpretable method for unsupervised learning that, without resorting to data augmentation, hyperparameter tuning, or other engineering designs, achieves performance close to that of SOTA SSL methods. Our approach leverages the sparse manifold transform, which unifies sparse coding, manifold learning, and slow feature analysis. With a one-layer deterministic sparse manifold transform, one can achieve 99.3% KNN top-1 accuracy on MNIST, 81.1% KNN top-1 accuracy on CIFAR-10, and 53.2% on CIFAR-100. With a simple gray-scale augmentation, the model reaches 83.2% KNN top-1 accuracy on CIFAR-10 and 57% on CIFAR-100. These results significantly close the gap between simplistic ``white-box'' methods and the SOTA methods. Additionally, we provide visualizations to explain how an unsupervised representation transform is formed. The proposed method is closely connected to latent-embedding self-supervised methods and can be treated as the simplest form of VICReg. Though a small performance gap remains between our simple constructive model and SOTA methods, the evidence points to this as a promising direction for achieving a principled and white-box approach to unsupervised learning.
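The KNN top-1 accuracy quoted above is a standard linear-probe-free evaluation: each test embedding is classified by a majority vote over its nearest training embeddings, and top-1 accuracy is the fraction classified correctly. A minimal sketch of that metric (the function name and the tiny synthetic data are illustrative, not from the paper):

```python
import numpy as np

def knn_top1_accuracy(train_emb, train_labels, test_emb, test_labels, k=5):
    """KNN top-1 accuracy: majority vote among the k nearest training embeddings."""
    # Pairwise Euclidean distances, shape (n_test, n_train)
    dists = np.linalg.norm(test_emb[:, None, :] - train_emb[None, :, :], axis=2)
    # Indices of the k nearest training points for each test point
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over neighbor labels
    preds = np.array([np.bincount(train_labels[row]).argmax() for row in nn_idx])
    return float((preds == test_labels).mean())

# Tiny synthetic check: two well-separated clusters of 8-d "embeddings"
rng = np.random.default_rng(0)
train_emb = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(5, 0.1, (20, 8))])
train_labels = np.array([0] * 20 + [1] * 20)
test_emb = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(5, 0.1, (5, 8))])
test_labels = np.array([0] * 5 + [1] * 5)
acc = knn_top1_accuracy(train_emb, train_labels, test_emb, test_labels, k=5)
```

In the paper's setting, `train_emb` and `test_emb` would be the outputs of the sparse manifold transform on the respective image splits.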