Hyperbolic neural networks have shown great potential for modeling complex data. However, existing hyperbolic networks are not fully hyperbolic: they encode features in a hyperbolic space yet formalize most of their operations in the tangent space (a Euclidean subspace) at the origin of the hyperbolic space. This hybrid approach greatly limits the modeling ability of such networks. In this paper, we propose a fully hyperbolic framework for building hyperbolic networks on the Lorentz model, adapting the Lorentz transformations (including boost and rotation) to formalize the essential operations of neural networks. Moreover, we prove that the linear transformation in tangent spaces used by existing hyperbolic networks is a relaxation of the Lorentz rotation and does not include the boost, implicitly limiting the capabilities of those networks. Experimental results on four NLP tasks show that our method yields better performance when building both shallow and deep networks. Our code will be released to facilitate follow-up research.
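The operations named above can be made concrete with a small numerical sketch (not from the paper; a minimal NumPy illustration under the standard curvature −1 Lorentz model). Points live on the hyperboloid {x : ⟨x, x⟩_L = −1, x₀ > 0}, and both Lorentz boosts and rotations are linear maps that keep points on this manifold, which is why composing them yields a "fully hyperbolic" linear layer:

```python
import numpy as np

# Minkowski metric for the 2+1-dimensional Lorentz model (signature -,+,+)
G = np.diag([-1.0, 1.0, 1.0])

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + x1*y1 + x2*y2."""
    return x @ G @ y

def boost(phi):
    """Lorentz boost with rapidity phi along the first spatial axis."""
    c, s = np.cosh(phi), np.sinh(phi)
    return np.array([[c,   s,   0.0],
                     [s,   c,   0.0],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    """Spatial rotation by theta; fixes the time-like coordinate x0."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c,  -s],
                     [0.0, s,   c]])

# A point on the hyperboloid {x : <x,x>_L = -1, x0 > 0}
x = np.array([np.cosh(0.7), np.sinh(0.7), 0.0])

# A composed Lorentz transformation: rotate, then boost
y = boost(1.2) @ rotation(0.5) @ x

print(np.isclose(lorentz_inner(x, x), -1.0))  # True: x is on the manifold
print(np.isclose(lorentz_inner(y, y), -1.0))  # True: Lorentz maps preserve the manifold
print(y[0] > 0)                               # True: the transformation is orthochronous
```

A tangent-space linear layer, by contrast, would map x to the tangent space at the origin, apply a Euclidean matrix, and map back; the sketch above stays on the manifold throughout, which is the distinction the abstract draws.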