Geometry problem solving (GPS) is a high-level mathematical reasoning task that requires both multi-modal fusion and the application of geometric knowledge. Recently, neural solvers have shown great potential in GPS but still fall short in diagram presentation and modal fusion. In this work, we convert diagrams into basic textual clauses to describe diagram features effectively, and propose a new neural solver called PGPSNet to fuse multi-modal information efficiently. Combining structural and semantic pre-training, data augmentation, and self-limited decoding, PGPSNet is endowed with rich knowledge of geometry theorems and geometric representation, which promotes geometric understanding and reasoning. In addition, to facilitate research on GPS, we build a new large-scale, finely annotated GPS dataset named PGPS9K, labeled with both fine-grained diagram annotations and interpretable solution programs. Experiments on PGPS9K and an existing dataset, Geometry3K, validate the superiority of our method over state-of-the-art neural solvers. The code and dataset will be made publicly available soon.