Recent progress on parse tree encoders for sentence representation learning is notable. However, these works mainly encode tree structures recursively, which hinders parallelization, and they rarely take into account the labels of arcs in dependency trees. To address both issues, we propose Dependency-Transformer, which applies a relation-attention mechanism that works in concert with self-attention. This mechanism encodes both the dependency relations and the spatial positional relations between nodes in a sentence's dependency tree. Through a score-based method, we inject the syntactic information without affecting the Transformer's parallelizability. Our model outperforms or matches state-of-the-art methods on four sentence representation tasks and has clear advantages in computational efficiency.
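To make the score-based injection concrete, below is a minimal PyTorch sketch of the general idea: standard scaled dot-product self-attention whose score matrix is biased by a learned scalar score for the dependency relation holding between each token pair, so syntax enters the computation without any recursive tree traversal. All names here (RelationBiasedSelfAttention, num_relations, rel_ids) are illustrative assumptions, not the paper's actual implementation.

```python
import math
import torch
import torch.nn as nn

class RelationBiasedSelfAttention(nn.Module):
    """Self-attention with an additive bias from dependency-relation labels.

    A sketch of score-based syntax injection, not the authors' exact model.
    """
    def __init__(self, d_model: int, num_relations: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One learned scalar per relation label (e.g. nsubj, dobj);
        # index 0 can be reserved for "no arc between this pair".
        self.rel_score = nn.Embedding(num_relations, 1)
        self.scale = math.sqrt(d_model)

    def forward(self, x: torch.Tensor, rel_ids: torch.LongTensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        # rel_ids: (batch, seq, seq) relation label for each token pair,
        # derived from the sentence's dependency parse.
        scores = self.q(x) @ self.k(x).transpose(-2, -1) / self.scale
        scores = scores + self.rel_score(rel_ids).squeeze(-1)  # inject syntax
        return torch.softmax(scores, dim=-1) @ self.v(x)
```

Because the bias is a simple lookup added to the full pairwise score matrix, every token pair is still processed in one batched matrix operation, which is what preserves the Transformer's parallelizability relative to recursive tree encoders.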