Neural network models often generalize poorly to mismatched domains or distributions. In NLP, this issue arises in particular when models are expected to generalize compositionally, that is, to novel combinations of familiar words and constructions. We investigate learning representations that facilitate transfer learning from one compositional task to another: the representation and the task-specific layers of the models are strategically trained differently on a pre-finetuning task so that they generalize well on mismatched splits that require compositionality. We apply this method to semantic parsing with three very different datasets, COGS, GeoQuery, and SCAN, each used alternately as the pre-finetuning and target task. Our method significantly improves compositional generalization over baselines on the test set of the target task, which is held out during fine-tuning. Ablation studies characterize the utility of the major steps in the proposed algorithm and support our hypothesis.
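The abstract does not spell out the training procedure, but the core idea, a shared representation and task-specific layers that receive different treatment during pre-finetuning and later fine-tuning, can be sketched. Below is a minimal PyTorch-style sketch assuming a toy LSTM encoder-decoder parser: the encoder plays the role of the shared representation and the decoder plus output projection play the role of the task-specific layers. The module names, the two-optimizer split, the learning rates, and the choice to keep the representation fixed during target-task fine-tuning are illustrative assumptions, not the paper's algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2SeqParser(nn.Module):
    """Toy encoder-decoder semantic parser. The encoder is treated as the
    shared 'representation'; the decoder and output projection are treated
    as the 'task-specific layers'. All choices here are illustrative."""
    def __init__(self, src_vocab, tgt_vocab, d_model=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.encoder = nn.LSTM(d_model, d_model, batch_first=True)
        self.decoder = nn.LSTM(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src, tgt_in):
        _, state = self.encoder(self.src_embed(src))
        dec_out, _ = self.decoder(self.tgt_embed(tgt_in), state)
        return self.out(dec_out)            # (batch, tgt_len, tgt_vocab)

def make_param_groups(model):
    """Split parameters so the representation and the task-specific layers
    can be trained differently (the split itself is an assumption)."""
    repr_params = list(model.src_embed.parameters()) + list(model.encoder.parameters())
    head_params = (list(model.tgt_embed.parameters())
                   + list(model.decoder.parameters())
                   + list(model.out.parameters()))
    return repr_params, head_params

def train_step(model, batch, repr_opt, head_opt, update_repr):
    """One update. During pre-finetuning both groups are updated; during
    target-task fine-tuning the representation is kept fixed here
    (update_repr=False), which is one possible instantiation of training
    the two parts differently."""
    src, tgt_in, tgt_out = batch
    logits = model(src, tgt_in)
    loss = F.cross_entropy(logits.transpose(1, 2), tgt_out)
    repr_opt.zero_grad()
    head_opt.zero_grad()
    loss.backward()
    if update_repr:
        repr_opt.step()
    head_opt.step()
    return loss.item()

# Illustrative setup; vocabulary sizes and learning rates are assumptions.
model = Seq2SeqParser(src_vocab=1000, tgt_vocab=800)
repr_params, head_params = make_param_groups(model)
repr_opt = torch.optim.Adam(repr_params, lr=1e-4)
head_opt = torch.optim.Adam(head_params, lr=1e-3)
```

In this sketch, pre-finetuning on one dataset (say SCAN) would call `train_step` with `update_repr=True` on a compositionally mismatched split, and fine-tuning on the target task (say COGS or GeoQuery) would call it with `update_repr=False`, while the target task's compositional test split remains held out for evaluation.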