Scale is often treated as a given, disturbing factor in many vision tasks, and treating it that way is one of the reasons more data is needed during learning. Recent work has added scale equivariance to convolutional neural networks and shown it to be effective for a range of tasks. We aim for accurate scale-equivariant convolutional neural networks (SE-CNNs) applicable to problems that require a high granularity of scale and small filter sizes. Current SE-CNNs rely on weight sharing and filter rescaling, the latter of which is accurate for integer scales only. To reach accurate scale equivariance, we derive the general constraints under which scale-convolution remains equivariant to discrete rescaling. We find the exact solution for all cases where it exists, and compute an approximation for the rest. The discrete scale-convolution pays off, as demonstrated by new state-of-the-art classification results on MNIST-scale and improved results on STL-10. With the same SE scheme, we also improve the computational efficiency of a scale-equivariant Siamese tracker on OTB-13.
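To make the weight-sharing and filter-rescaling step concrete, below is a minimal sketch of a scale-convolution in PyTorch: one canonical filter is shared across a discrete set of scales, and per-scale copies are produced by bilinear interpolation. The function name `scale_convolution`, the scale set, and the energy normalization are illustrative assumptions, not the paper's implementation; interpolation-based rescaling of this kind is exactly the naive baseline that the abstract notes is accurate for integer scales only.

```python
# Minimal sketch of a scale-convolution layer (illustrative, not the
# authors' implementation). A single canonical filter is shared across
# scales; per-scale copies are built by bilinear rescaling, the step
# that is only exact for integer scale factors.
import torch
import torch.nn.functional as F


def scale_convolution(x, weight, scales=(1.0, 1.5, 2.0)):
    """Convolve `x` with rescaled copies of one shared filter.

    x:      input of shape (batch, in_ch, H, W)
    weight: canonical filter of shape (out_ch, in_ch, k, k)
    Returns a tensor of shape (batch, len(scales), out_ch, H, W),
    i.e. the output gains an explicit scale axis.
    """
    k = weight.shape[-1]
    outputs = []
    for s in scales:
        # Rescale the shared filter by bilinear interpolation (the
        # naive choice whose equivariance error is being analyzed).
        ks = max(int(round(k * s)), 1)
        ks += 1 - ks % 2  # force an odd kernel size so padding preserves H, W
        w_s = F.interpolate(weight, size=(ks, ks),
                            mode='bilinear', align_corners=False)
        # Renormalize so the filter's energy does not depend on scale
        # (an assumed convention for this sketch).
        w_s = w_s * (weight.norm() / (w_s.norm() + 1e-8))
        outputs.append(F.conv2d(x, w_s, padding=ks // 2))
    return torch.stack(outputs, dim=1)


# Usage: an 8-channel 3x3 shared filter applied at three scales.
x = torch.randn(2, 3, 32, 32)
w = torch.randn(8, 3, 3, 3)
y = scale_convolution(x, w)
print(y.shape)  # torch.Size([2, 3, 8, 32, 32])
```

Because rescaling to non-integer scales must interpolate between pixels, the resulting filter bank only approximates a true rescaling of the canonical filter; deriving the constraints under which the convolution nevertheless remains equivariant to discrete rescaling is the contribution the abstract describes.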