Sparse coding has been proposed as a theory of visual cortex and as an unsupervised algorithm for learning representations. We show empirically with the MNIST dataset that sparse codes can be very sensitive to image distortions, a behavior that may hinder invariant object recognition. A locally linear analysis suggests that the sensitivity is due to the existence of linear combinations of active dictionary elements with high cancellation. A nearest-neighbor classifier is shown to perform worse on sparse codes than on original images. For a linear classifier with a sufficiently large number of labeled examples, sparse codes are shown to yield higher accuracy than original images, but no higher than a representation computed by a random feedforward net. Sensitivity to distortions seems to be a basic property of sparse codes, and one should be aware of this property when applying sparse codes to invariant object recognition.
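The locally linear argument in the abstract can be illustrated with a toy case. The sketch below is not the authors' code, and it assumes an L1-penalized least-squares (LASSO) formulation of sparse coding, which the abstract does not specify. Under that assumption, while the active set and coefficient signs stay fixed, a small input change dx shifts the active coefficients by da = (D^T D)^{-1} D^T dx, where D holds the active dictionary elements. The helper names (`dot`, `code_change`, `gain`) and the two-atom dictionary are hypothetical, chosen only for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def code_change(d1, d2, dx):
    """Change in the two active coefficients for an input change dx.

    Assumes (hypothetically) a LASSO code whose active set and signs do
    not change, so da solves G da = D^T dx with G the Gram matrix.
    """
    g11, g12, g22 = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    det = g11 * g22 - g12 * g12
    b1, b2 = dot(d1, dx), dot(d2, dx)
    # Closed-form inverse of the 2x2 Gram matrix applied to D^T dx.
    return ((g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)

def gain(theta, eps=1e-3):
    """Ratio |da| / |dx| for two unit atoms separated by angle theta.

    The input is perturbed along d1 - d2, the combination of active
    atoms in which the two atoms largely cancel when theta is small.
    """
    d1 = (1.0, 0.0)
    d2 = (math.cos(theta), math.sin(theta))
    diff = (d1[0] - d2[0], d1[1] - d2[1])
    norm = math.hypot(*diff)
    dx = (eps * diff[0] / norm, eps * diff[1] / norm)
    da = code_change(d1, d2, dx)
    return math.hypot(*da) / eps

if __name__ == "__main__":
    for theta in (math.pi / 2, 0.5, 0.1, 0.01):
        print(f"angle {theta:.3f} rad -> gain {gain(theta):.2f}")
```

For orthogonal atoms the gain is 1, and it grows as the atoms become nearly parallel, because d1 - d2 then has a small norm while its coefficient vector (1, -1) does not. In this two-atom case the gain equals the reciprocal of the smallest singular value of D, which is one way to read "high cancellation" as a source of sensitivity.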