The manifold hypothesis (that real-world data concentrates near low-dimensional manifolds) has been suggested as the principle behind the effectiveness of machine learning algorithms on the very high-dimensional problems common in domains such as vision and speech. Multiple methods have been proposed to explicitly incorporate the manifold hypothesis as a prior in modern Deep Neural Networks (DNNs), with varying success. In this paper, we propose a new method, Distance Learner, to incorporate this prior for DNN-based classifiers. Distance Learner is trained to predict the distance of a point from the underlying manifold of each class, rather than the class label. For classification, Distance Learner then chooses the class corresponding to the closest predicted class manifold. Distance Learner can also identify points as out of distribution (belonging to neither class) if the distance to the closest manifold is higher than a threshold. We evaluate our method on multiple synthetic datasets and show that Distance Learner learns much more meaningful classification boundaries than a standard classifier. We also evaluate our method on the task of adversarial robustness, and find that it not only outperforms a standard classifier by a large margin, but also performs on par with classifiers trained via state-of-the-art adversarial training.
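The decision rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the `-1` out-of-distribution sentinel, and the threshold parameter are all assumptions, and the per-class distances are taken as given (in the paper they are produced by the trained network).

```python
import numpy as np

def distance_learner_predict(predicted_distances, ood_threshold):
    """Classify a point from its predicted per-class manifold distances.

    predicted_distances: array of shape (n_classes,), the predicted
        distance of the point from each class manifold.
    ood_threshold: distances above this mark the point as out of
        distribution.
    Returns the index of the closest class manifold, or -1 (a sentinel
    chosen here for illustration) if even the closest manifold is
    farther than the threshold.
    """
    d = np.asarray(predicted_distances)
    closest = int(np.argmin(d))          # nearest predicted class manifold
    if d[closest] > ood_threshold:
        return -1                        # belongs to neither class
    return closest
```

A standard classifier would instead take an argmax over class scores; predicting distances lets the same output both classify in-distribution points and reject points far from every class manifold.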