This paper introduces an end-to-end residual network that operates entirely on the Poincaré ball model of hyperbolic space. Hyperbolic learning has recently shown great potential for visual understanding, but it is currently performed only in the penultimate layer(s) of deep networks; all visual representations are still learned through standard Euclidean networks. In this paper, we investigate how to learn hyperbolic representations of visual data directly from the pixel level. We propose Poincaré ResNet, a hyperbolic counterpart of the celebrated residual network, spanning Poincaré 2D convolutions through Poincaré residual connections. We identify three roadblocks for training convolutional networks entirely in hyperbolic space and propose a solution for each: (i) Current hyperbolic network initializations collapse to the origin, limiting their applicability in deeper networks. We provide an identity-based initialization that preserves norms over many layers. (ii) Residual networks rely heavily on batch normalization, which requires expensive Fréchet mean calculations in hyperbolic space. We introduce Poincaré midpoint batch normalization as a faster and equally effective alternative. (iii) Because Poincaré layers involve many intermediate operations, the computation graphs built by deep learning libraries blow up, limiting our ability to train deep hyperbolic networks. We provide manual backward derivations of core hyperbolic operations to keep the computation graphs manageable.
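To make the midpoint idea in (ii) concrete, the following is a minimal NumPy sketch of a closed-form midpoint on the Poincaré ball. The abstract does not state the formula, so this assumes the gyromidpoint in Ungar's sense with equal weights and curvature parameter `c` (ball radius `1/sqrt(c)`). The function names (`mobius_add`, `mobius_scalar_mul`, `poincare_distance`, `poincare_midpoint`) are illustrative and are not taken from the paper's code. Unlike the Fréchet mean, which has no closed form and is typically found by an iterative solver, this midpoint is a single weighted sum followed by a rescaling.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition on the Poincare ball with curvature parameter c."""
    xy = np.sum(x * y, axis=-1, keepdims=True)
    x2 = np.sum(x * x, axis=-1, keepdims=True)
    y2 = np.sum(y * y, axis=-1, keepdims=True)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def mobius_scalar_mul(r, x, c=1.0, eps=1e-15):
    """Mobius scalar multiplication: rescales hyperbolic distance to the origin by r."""
    norm = np.clip(np.linalg.norm(x, axis=-1, keepdims=True), eps, None)
    sc = np.sqrt(c)
    return np.tanh(r * np.arctanh(sc * norm)) * x / (sc * norm)

def poincare_distance(x, y, c=1.0):
    """Geodesic distance between x and y on the Poincare ball."""
    sc = np.sqrt(c)
    diff = np.linalg.norm(mobius_add(-x, y, c), axis=-1)
    return 2.0 / sc * np.arctanh(sc * diff)

def poincare_midpoint(xs, c=1.0):
    """Equal-weight gyromidpoint of the rows of xs (assumed form, see lead-in)."""
    # Conformal factor lambda_i = 2 / (1 - c * ||x_i||^2) for each point.
    lam = 2.0 / (1.0 - c * np.sum(xs * xs, axis=-1, keepdims=True))
    num = np.sum(lam * xs, axis=0)
    den = np.sum(lam - 1.0, axis=0)
    # The ratio equals 2 (x) midpoint, so scale back by 1/2.
    return mobius_scalar_mul(0.5, num / den, c)

a = np.array([0.3, 0.1])
b = np.array([-0.2, 0.5])
m = poincare_midpoint(np.stack([a, b]))
print("midpoint:", m)
print("distances to a and b:", poincare_distance(m, a), poincare_distance(m, b))
```

For two points this midpoint coincides with the geodesic midpoint, so the two printed distances should agree up to floating-point error. A batch normalization layer built on it would presumably recenter each batch by Möbius-adding the negated midpoint, but that step is an assumption here rather than something the abstract specifies.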