We propose building a reinforcement learning prover from independent components: a deductive system (the environment), a proof state representation (how the agent sees the environment), and an agent training algorithm. To that end, we contribute an additional Vampire-based environment to the $\texttt{gym-saturation}$ package of OpenAI Gym environments for saturation provers. We demonstrate a prototype that uses $\texttt{gym-saturation}$ together with a popular reinforcement learning framework (Ray $\texttt{RLlib}$). Finally, we discuss our plans for developing this work in progress into a competitive automated theorem prover.
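The separation into three components can be illustrated with a minimal, self-contained sketch. It is not the $\texttt{gym-saturation}$ API and does not call Vampire: the environment below is a toy propositional resolution loop, and every name in it (`ToySaturationEnv`, `represent`, `smallest_first_agent`, `run_episode`) is hypothetical. It only assumes the usual Gym-style `reset`/`step` convention, with the action being the index of the clause selected as the given clause.

```python
from typing import FrozenSet, List

# A clause is a set of non-zero integer literals; -n is the negation of n.
Clause = FrozenSet[int]


def resolvents(a: Clause, b: Clause) -> List[Clause]:
    """Return all non-tautological binary resolvents of two clauses."""
    out = []
    for lit in a:
        if -lit in b:
            res = (a - {lit}) | (b - {-lit})
            if not any(-x in res for x in res):  # drop tautologies
                out.append(frozenset(res))
    return out


class ToySaturationEnv:
    """Component 1 (hypothetical): the deductive system as an environment."""

    def __init__(self, clauses):
        self.initial = [frozenset(c) for c in clauses]

    def reset(self):
        self.processed = []
        self.unprocessed = list(dict.fromkeys(self.initial))  # deduplicate
        return list(self.unprocessed)

    def step(self, action: int):
        # The action picks the given clause among the unprocessed ones.
        given = self.unprocessed.pop(action)
        self.processed.append(given)
        known = set(self.processed) | set(self.unprocessed)
        for other in self.processed:
            for res in resolvents(given, other):
                if res not in known:
                    known.add(res)
                    self.unprocessed.append(res)
        proved = frozenset() in self.unprocessed  # empty clause derived
        done = proved or not self.unprocessed
        return list(self.unprocessed), (1.0 if proved else 0.0), done, {}


def represent(state: List[Clause]) -> List[int]:
    """Component 2 (hypothetical): the proof state as a list of clause sizes."""
    return [len(clause) for clause in state]


def smallest_first_agent(observation: List[int]) -> int:
    """Component 3 (hypothetical): a fixed policy standing in for a trained agent."""
    return min(range(len(observation)), key=observation.__getitem__)


def run_episode(env, represent, agent, max_steps: int = 100) -> float:
    """Run one episode; the three components are passed in independently."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        state, reward, done, _ = env.step(agent(represent(state)))
        total += reward
        if done:
            break
    return total
```

Because `run_episode` takes the environment, the representation, and the agent as separate arguments, any one of them can be replaced without touching the other two, which is the design property the proposal relies on.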