Abstract Meaning Representation (AMR) is a broad-coverage semantic formalism that represents sentence meaning as a directed acyclic graph. To train most AMR parsers, one needs to segment the graph into subgraphs and align each such subgraph to a word in the sentence; this is normally done at preprocessing time, relying on hand-crafted rules. In contrast, we treat both alignment and segmentation as latent variables in our model and induce them as part of end-to-end training. As marginalizing over the structured latent variables is infeasible, we use the variational autoencoding framework. To ensure end-to-end differentiable optimization, we introduce a differentiable relaxation of the segmentation and alignment problems. We observe that inducing segmentation yields substantial gains over using a `greedy' segmentation heuristic. The performance of our method also approaches that of a model that relies on the segmentation rules of \citet{lyu-titov-2018-amr}, which were hand-crafted to handle individual AMR constructions.