Compositional generalization refers to a model's ability to generalize to newly composed inputs built from components observed during training. Because generalization is an important aspect of language and problem-solving skills, it has prompted a series of compositional generalization analyses on different tasks. However, similar discussion of math word problems (MWPs) is limited. In this manuscript, we study compositional generalization in MWP solving. Specifically, we first introduce a data splitting method to create compositional splits from existing MWP datasets. We also synthesize data to isolate the effect of compositions. To improve compositional generalization in MWP solving, we propose an iterative data augmentation method that adds diverse compositional variation to the training data and can be combined with existing MWP methods. In our evaluation, we examine a set of methods and find that all of them suffer severe performance loss on the evaluated datasets. We also find that our data augmentation method can significantly improve the compositional generalization of general MWP methods. Code is available at https://github.com/demoleiwang/CGMWP.
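To make the idea of a compositional split concrete, the following is a minimal sketch, not the procedure used in this work. The abstract does not specify the splitting criterion, so the sketch assumes that a "composition" is the operator template of a problem's equation (numbers replaced by a placeholder) and that the "components" are the individual operators. The field names `text` and `equation`, the helper names, and the toy problems are all hypothetical.

```python
import re
from collections import defaultdict


def equation_template(equation: str) -> str:
    """Replace numeric literals with 'N' so equations that differ only in numbers share a template."""
    return re.sub(r"\d+(?:\.\d+)?", "N", equation.replace(" ", ""))


def operators(template: str) -> set:
    """Return the set of arithmetic operators in a template (the assumed 'components')."""
    return {ch for ch in template if ch in "+-*/"}


def compositional_split(problems, test_fraction=0.2):
    """Hold out whole templates for testing, keeping every held-out operator visible in training.

    Assumption: a template may move to the test set only if each of its
    operators still occurs in at least one remaining training template.
    """
    groups = defaultdict(list)
    for problem in problems:
        groups[equation_template(problem["equation"])].append(problem)

    target = test_fraction * len(problems)
    train_templates = set(groups)
    test_set = []
    # Visit the rarest templates first; ties are broken alphabetically for determinism.
    for tpl in sorted(groups, key=lambda t: (len(groups[t]), t)):
        if len(test_set) >= target:
            break
        remaining = train_templates - {tpl}
        seen_ops = set().union(*(operators(t) for t in remaining))
        if remaining and operators(tpl) <= seen_ops:
            train_templates = remaining
            test_set.extend(groups[tpl])

    train_set = [p for t in sorted(train_templates) for p in groups[t]]
    return train_set, test_set


# Hypothetical toy data, for illustration only.
toy_problems = [
    {"text": "Ann has 3 apples and buys 4 more.", "equation": "x = 3 + 4"},
    {"text": "Bob has 5 pens and finds 2 more.", "equation": "x = 5 + 2"},
    {"text": "Each of 6 boxes holds 2 balls.", "equation": "x = 6 * 2"},
    {"text": "Each of 7 bags holds 3 coins.", "equation": "x = 7 * 3"},
    {"text": "Cy has 2 coins and 3 bags of 4 coins.", "equation": "x = 2 + 3 * 4"},
]

train_set, test_set = compositional_split(toy_problems, test_fraction=0.2)
train_templates = {equation_template(p["equation"]) for p in train_set}
test_templates = {equation_template(p["equation"]) for p in test_set}
```

In this toy run the only held-out template combines addition and multiplication, both of which appear separately in training. A model evaluated on such a split has seen every component but not that particular composition.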