Structured Reordering for Modeling Latent Alignments in Sequence Transduction

Despite success in many domains, neural models struggle in settings where train and test examples are drawn from different distributions. In particular, in contrast to humans, conventional sequence-to-sequence (seq2seq) models fail to generalize systematically, i.e., to interpret sentences representing novel combinations of concepts (e.g., text segments) seen in training. Traditional grammar formalisms excel in such settings by implicitly encoding alignments between input and output segments, but are hard to scale and maintain. Instead of engineering a grammar, we directly model segment-to-segment alignments as discrete structured latent variables within a neural seq2seq model. To efficiently explore the large space of alignments, we introduce a reorder-first align-later framework whose central component is a neural reordering module producing separable permutations. We present an efficient dynamic programming algorithm that performs exact marginal inference over separable permutations, thus enabling end-to-end differentiable training of our model. The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks (i.e., semantic parsing and machine translation).
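
As a small illustration of the objects the reordering module produces, the sketch below tests whether a permutation is separable. It relies on the standard combinatorial characterization (a permutation is separable exactly when it avoids the patterns 2413 and 3142), not on anything specific to the paper, and it is a brute-force check rather than the dynamic programming algorithm the abstract refers to. The function names `pattern`, `is_separable`, and `count_separable` are hypothetical and chosen only for this example.

```python
from itertools import combinations, permutations

# Zero-based forms of the forbidden patterns 2413 and 3142.
FORBIDDEN = {(1, 3, 0, 2), (2, 0, 3, 1)}


def pattern(values):
    """Return the relative order of `values` as a tuple of zero-based ranks."""
    ranks = {v: r for r, v in enumerate(sorted(values))}
    return tuple(ranks[v] for v in values)


def is_separable(perm):
    """True if no length-4 subsequence of `perm` forms 2413 or 3142."""
    return all(pattern(sub) not in FORBIDDEN for sub in combinations(perm, 4))


def count_separable(n):
    """Count separable permutations of length n by exhaustive enumeration."""
    return sum(is_separable(p) for p in permutations(range(n)))


if __name__ == "__main__":
    # Compare the size of the separable subset with all n! permutations.
    for n in range(1, 7):
        total = sum(1 for _ in permutations(range(n)))
        print(n, count_separable(n), total)
```

For length 4, only the two forbidden patterns themselves are excluded, so 22 of the 24 permutations are separable. The exhaustive count is practical only for small lengths. It is meant to show what the restricted space looks like, not how to do inference over it.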

