Acquiring mathematical skills is considered a key challenge for modern Artificial Intelligence systems. Inspired by the way humans discover numerical knowledge, we introduce a deep reinforcement learning framework that makes it possible to simulate how cognitive agents could gradually learn to solve arithmetic problems by interacting with a virtual abacus. The proposed model successfully learns to perform multi-digit additions and subtractions, achieving an error rate below 1% even when operands are much longer than those observed during training. We also compare the performance of learning agents that receive different amounts of explicit supervision, and we analyze the most common error patterns to better understand the limitations and biases resulting from our design choices.