While showing impressive performance on various kinds of learning tasks, it is yet unclear whether deep learning models have the ability to robustly tackle reasoning tasks. than by learning the underlying reasoning process that is actually required to solve the tasks. Measuring the robustness of reasoning in machine learning models is challenging as one needs to provide a task that cannot be easily shortcut by exploiting spurious statistical correlations in the data, while operating on complex objects and constraints. reasoning task. To address this issue, we propose ChemAlgebra, a benchmark for measuring the reasoning capabilities of deep learning models through the prediction of stoichiometrically-balanced chemical reactions. ChemAlgebra requires manipulating sets of complex discrete objects -- molecules represented as formulas or graphs -- under algebraic constraints such as the mass preservation principle. We believe that ChemAlgebra can serve as a useful test bed for the next generation of machine reasoning models and as a promoter of their development.
翻译:虽然在各种学习任务上表现出令人印象深刻的成绩,但尚不清楚深层次学习模式是否有能力强有力地应对推理任务,而不是学习解决任务实际需要的基本推理过程。衡量机器学习模型中推理的稳健性具有挑战性,因为人们需要通过利用数据中虚假的统计相关性,同时在复杂的物体和制约因素上操作,来提供一项不易捷径的任务。为解决这一问题,我们提议ChemAlgebra,这是一个基准,用来通过预测对等化学反应来测量深层次学习模型的推理能力。ChemAlgebra需要在诸如质量保护原则等代数限制下,操纵一组复杂的离散物体 -- -- 以公式或图表为代表的分子 -- -- 来进行操纵。我们认为ChemAlgebra可以作为下一代机器推理模型的有用测试台,并成为其发展的促进者。