In this paper we present a method by which a naive agent can detect novel objects it encounters during interaction. We train a reinforcement learning policy on a stacking task with a known object type, then observe the results when the agent attempts to stack various other objects using the same trained policy. By extracting embedding vectors from a convolutional neural network trained on the outcomes of this stacking play, we can measure the similarity of a given object to known object types and determine whether it is dissimilar enough from those types to be considered a novel class of object. We present the results of this method on two datasets gathered using two different policies, and we demonstrate what information the agent needs to extract from its environment to make these novelty judgments.
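The similarity-then-novelty decision described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the decision rule, so everything here is an assumption made for illustration: cosine similarity against per-class mean embeddings, a fixed `threshold`, and the hypothetical helper names `class_centroids` and `judge_novelty`. The embeddings would come from the trained convolutional network; plain arrays stand in for them.

```python
import numpy as np


def class_centroids(embeddings, labels):
    """Mean embedding per known object type (assumed class summary)."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    centroids = {}
    for c in np.unique(labels):
        # Average all embeddings that carry this label.
        centroids[c.item()] = embeddings[labels == c].mean(axis=0)
    return centroids


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def judge_novelty(embedding, centroids, threshold=0.8):
    """Return (closest known type, similarity, is_novel).

    The object is flagged as novel when even its closest known type
    is less similar than `threshold` (a hypothetical, untuned value).
    """
    embedding = np.asarray(embedding, dtype=float)
    sims = {c: cosine(embedding, mu) for c, mu in centroids.items()}
    best = max(sims, key=sims.get)
    return best, sims[best], sims[best] < threshold


# Toy stand-in embeddings for two known object types.
known = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
types = ["cube", "cube", "sphere", "sphere"]
centroids = class_centroids(known, types)
print(judge_novelty([1.0, 0.0], centroids))
print(judge_novelty([-1.0, -1.0], centroids))
```

In practice the threshold would need to be chosen from held-out data on known types, and other distance measures or class summaries could be substituted without changing the overall structure.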