Most prior work on exemplar-based syntactically controlled paraphrase generation relies on large-scale, automatically constructed paraphrase datasets. We sidestep this prerequisite by adapting models from prior work to learn solely from bilingual text (bitext). Despite training only on bitext, and in near zero-shot conditions, our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions. To evaluate these tasks quantitatively, we create three novel evaluation datasets. Our experimental results show that our models achieve competitive results on controlled paraphrase generation and strong performance on controlled machine translation. Analysis shows that our models learn to disentangle semantics and syntax in their latent representations.
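As a hedged illustration of the four-task setup (not the paper's actual interface; all names and the en-fr language pair below are hypothetical), the sketch pairs a semantic input with a syntactic exemplar: when the two share a language the model performs controlled paraphrasing in that language, and when they differ it performs controlled translation in that direction.

```python
# Hypothetical sketch only: how one model's inputs cover four tasks.
from dataclasses import dataclass

@dataclass
class ControlledInput:
    semantic_src: str  # sentence supplying the meaning
    syntactic_ex: str  # exemplar supplying the target syntax
    src_lang: str      # language of semantic_src
    tgt_lang: str      # language of syntactic_ex and of the output

def task_name(x: ControlledInput) -> str:
    # Same language in and out -> controlled paraphrase generation;
    # different languages -> controlled machine translation.
    if x.src_lang == x.tgt_lang:
        return f"controlled paraphrase generation ({x.src_lang})"
    return f"controlled machine translation ({x.src_lang}->{x.tgt_lang})"

# The four task directions for a hypothetical English-French bitext model:
examples = [
    ControlledInput("The cat sat.", "Did the dog run?", "en", "en"),
    ControlledInput("Le chat s'est assis.", "Le chien a-t-il couru ?", "fr", "fr"),
    ControlledInput("The cat sat.", "Le chien a couru.", "en", "fr"),
    ControlledInput("Le chat s'est assis.", "The dog ran.", "fr", "en"),
]
for ex in examples:
    print(task_name(ex))
```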