Solving symbolic mathematics has always been in the arena of human ingenuity, requiring compositional reasoning and recurrence. However, recent studies have shown that large-scale language models such as transformers are universal and, surprisingly, can be trained on a sequence-to-sequence task to solve complex mathematical equations. These large transformer models need enormous amounts of training data to generalize to unseen symbolic mathematics problems. In this paper, we present a sample-efficient way of solving symbolic tasks by first pretraining the transformer model on language translation and then fine-tuning the pretrained model on the downstream task of symbolic mathematics. With our pretrained model, we achieve comparable accuracy on the integration task while using around $1.5$ orders of magnitude fewer training samples than the state-of-the-art deep learning approach for symbolic mathematics. The test accuracy on differential equation tasks is considerably lower than on integration, as they need higher-order recursions that are not present in language translation. We explain the generalizability of our pretrained language model using the Anna Karenina Principle (AKP). We pretrain our model on different pairs of language translations, and our results show language bias in solving symbolic mathematics tasks. Finally, we study the robustness of the fine-tuned model on symbolic math tasks against distribution shift, and find that our approach generalizes better in distribution shift scenarios for function integration.