In text-to-SQL tasks, as in much of NLP, compositional generalization is a major challenge: neural networks struggle to generalize when the training and test distributions differ compositionally. However, most recent attempts to improve this rely on word-level synthetic data or specific dataset splits to generate compositional biases. In this work, we propose a clause-level compositional example generation method. We first split the sentences in the Spider text-to-SQL dataset into sub-sentences, annotating each sub-sentence with its corresponding SQL clause, resulting in a new dataset, Spider-SS. We then construct a further dataset, Spider-CG, by composing Spider-SS sub-sentences in different combinations, to test the ability of models to generalize compositionally. Experiments show that existing models suffer significant performance degradation when evaluated on Spider-CG, even though every sub-sentence is seen during training. To address this problem, we modify a number of state-of-the-art models to train on the segmented data of Spider-SS, and we show that this method improves their generalization performance.
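As a rough illustration of the clause-level idea, the following sketch shows what a Spider-SS-style annotation might look like and how sub-sentence/clause pairs could be recombined in the spirit of Spider-CG. The field names and the example question are hypothetical, not the actual dataset schema:

```python
# Hypothetical sketch of clause-level annotation in the style of Spider-SS.
# Each sub-sentence of a natural-language question is paired with the SQL
# clause it expresses; keys and values here are illustrative only.

annotated_example = [
    {"sub_sentence": "Show the names of singers",
     "sql_clause": "SELECT name FROM singer"},
    {"sub_sentence": "whose age is above 30",
     "sql_clause": "WHERE age > 30"},
]

def compose(segments):
    """Recombine sub-sentence annotations into a full question/SQL pair,
    mimicking (loosely) how Spider-CG composes Spider-SS segments."""
    question = " ".join(seg["sub_sentence"] for seg in segments)
    sql = " ".join(seg["sql_clause"] for seg in segments)
    return question, sql

question, sql = compose(annotated_example)
# question: "Show the names of singers whose age is above 30"
# sql:      "SELECT name FROM singer WHERE age > 30"
```

Swapping in a different modifier sub-sentence (e.g. one annotated with a different WHERE clause) yields a new composed example whose parts were all seen in training, which is exactly the generalization setting the abstract describes.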