If we changed the rules, would the wise become fools? Different groups formalize reinforcement learning (RL) in different ways. If an agent in one RL framework is to run within another RL framework's environments, the agent must first be converted, or mapped, into that other framework. Whether or not this is possible depends on the RL frameworks in question and on how intelligence is measured. In this paper, we lay foundations for studying relative-intelligence-preserving mappability between RL frameworks.
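To make the idea of converting an agent between RL frameworks concrete, here is a minimal illustrative sketch. It assumes two toy formalizations: one in which the agent receives observation and reward as separate arguments, and one in which each percept is a single (observation, reward) pair. All class and function names here are hypothetical and are not taken from the paper's formalism.

```python
# Illustrative sketch only: two toy formalizations of RL and a mapping
# between them. Names are hypothetical, not from the paper.

class RewardSeparateAgent:
    """Framework A: the agent receives observation and reward separately."""
    def __init__(self):
        self.total_reward = 0.0

    def act(self, observation, reward):
        self.total_reward += reward
        return 0  # this toy agent always takes action 0


def to_percept_framework(agent):
    """Map a Framework-A agent into Framework B, where each percept is
    a single (observation, reward) pair."""
    class WrappedAgent:
        def act(self, percept):
            observation, reward = percept  # unpack Framework B's percept
            return agent.act(observation, reward)
    return WrappedAgent()


# The mapped agent now runs in Framework B's environments while
# behaving identically to the original agent.
a = RewardSeparateAgent()
b = to_percept_framework(a)
action = b.act(("some observation", 1.0))
```

Whether such a mapping preserves an agent's *relative* intelligence under a given intelligence measure is exactly the kind of question the paper's foundations are meant to address; this sketch only shows that a behavior-preserving conversion exists for these two toy formalizations.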