This article presents a hybrid approach between scale-space theory and deep learning, in which a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade. By sharing the learnt parameters between multiple scale channels, and by using the transformation properties of the scale-space primitives under scaling transformations, the resulting network becomes provably scale covariant. By additionally performing max pooling over the multiple scale channels, the resulting network architecture for image classification also becomes provably scale invariant. We investigate the performance of such networks on the MNISTLargeScale dataset, which contains rescaled images from the original MNIST dataset, spanning a factor of 4 in scale for the training data and a factor of 16 for the testing data. It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not present in the training data.
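To make the scale-channel construction concrete, the following is a minimal sketch, not the authors' implementation: a single weight-shared backbone is applied to copies of the input rescaled by different factors, and max pooling over the resulting scale channels yields an approximately scale-invariant classification output. The class name ScaleChannelNet, the backbone layers, and the scale factors are illustrative assumptions; the paper instead builds its channels from parameterized scale-space operations coupled in cascade, which this sketch replaces with input rescaling for simplicity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleChannelNet(nn.Module):
    """Illustrative scale-channel network (names and layers are assumptions)."""

    def __init__(self, num_classes=10, scales=(0.5, 1.0, 2.0)):
        super().__init__()
        self.scales = scales
        # Shared backbone: the same learnt parameters are reused in every
        # scale channel (parameter sharing across the scale channels).
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        logits_per_scale = []
        for s in self.scales:
            # Rescaling the input forms one scale channel; the adaptive
            # pooling in the backbone absorbs the varying spatial size.
            xs = F.interpolate(x, scale_factor=s, mode='bilinear',
                               align_corners=False)
            logits_per_scale.append(self.backbone(xs))
        # Max pooling over the scale channels gives (approximate)
        # scale invariance of the classification output.
        return torch.stack(logits_per_scale, dim=0).max(dim=0).values

# Usage: a batch of 28x28 grayscale images, as in MNIST.
net = ScaleChannelNet()
out = net(torch.randn(8, 1, 28, 28))  # -> shape (8, 10)
```

Because the backbone weights are shared, rescaling a test image mainly shifts which scale channel responds most strongly, and the max over channels keeps the prediction stable, which is the mechanism behind the scale generalization reported in the abstract.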