Real-world visual data exhibit intrinsic hierarchical structures that can be represented effectively in hyperbolic spaces. Hyperbolic neural networks (HNNs) are a promising approach for learning feature representations in such spaces. However, current methods in computer vision rely on Euclidean backbones and project features to hyperbolic space only in the task heads, which limits their ability to fully leverage the benefits of hyperbolic geometry. To address this, we present HCNN, the first fully hyperbolic convolutional neural network (CNN) designed for computer vision tasks. Based on the Lorentz model, we generalize fundamental components of CNNs and propose novel formulations of the convolutional layer, batch normalization, and multinomial logistic regression (MLR). Experiments on standard vision tasks demonstrate the effectiveness of our HCNN framework and the Lorentz model in both hybrid and fully hyperbolic settings. Overall, we aim to pave the way for future research in hyperbolic computer vision by offering a new paradigm for interpreting and analyzing visual data. Our code is publicly available at https://github.com/kschwethelm/HyperbolicCV.
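To make the hybrid setting concrete, the following minimal sketch shows how a Euclidean feature vector could be projected onto the Lorentz model (the hyperboloid) and how distances are measured there. It is an illustration under stated assumptions, not the HCNN implementation: the function names `lorentz_inner`, `expmap0`, and `lorentz_dist` are hypothetical and are not taken from the HyperbolicCV repository, and the convention of a negative curvature parameter `k` with points satisfying <x, x>_L = 1/k is assumed here.

```python
import math


def lorentz_inner(x, y):
    """Lorentz inner product: -x0*y0 + sum_i xi*yi (index 0 is the time axis)."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def expmap0(u, k=-1.0):
    """Exponential map at the hyperboloid origin (assumed convention, k < 0).

    Treats the Euclidean vector u as a tangent vector at the origin and
    returns a point x with lorentz_inner(x, x) == 1/k.
    """
    s = math.sqrt(-k)
    norm = math.sqrt(sum(c * c for c in u))
    if norm < 1e-12:
        # The zero vector maps to the origin (1/sqrt(-k), 0, ..., 0).
        return [1.0 / s] + [0.0] * len(u)
    a = s * norm
    scale = math.sinh(a) / (s * norm)
    return [math.cosh(a) / s] + [scale * c for c in u]


def lorentz_dist(x, y, k=-1.0):
    """Geodesic distance between two points on the hyperboloid."""
    # Clamp to 1.0 so rounding error cannot push acosh outside its domain.
    arg = max(k * lorentz_inner(x, y), 1.0)
    return math.acosh(arg) / math.sqrt(-k)


# Usage: project a 2-D Euclidean feature and measure its distance from the origin.
feature = [0.3, 0.4]
point = expmap0(feature)
origin = expmap0([0.0, 0.0])
print(point, lorentz_dist(origin, point))
```

Under these assumptions, the distance from the origin to the projected point equals the Euclidean norm of the input feature, which is why this map is a common choice for attaching a hyperbolic task head to a Euclidean backbone. A fully hyperbolic network would instead keep all intermediate features on the hyperboloid.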