The ability to acquire abstract knowledge is a hallmark of human intelligence and is believed by many to be one of the core differences between humans and neural network models. Agents can be endowed with an inductive bias towards abstraction through meta-learning, where they are trained on a distribution of tasks that share some abstract structure that can be learned and applied. However, because neural networks are hard to interpret, it can be difficult to tell whether agents have learned the underlying abstraction or have instead learned statistical patterns that are characteristic of that abstraction. In this work, we compare the performance of humans and agents in a meta-reinforcement learning paradigm in which tasks are generated from abstract rules. We define a novel methodology for building "task metamers" that closely match the statistics of the abstract tasks but use a different underlying generative process, and evaluate performance on both abstract and metamer tasks. In our first set of experiments, we find that humans perform better on abstract tasks than on metamer tasks, whereas a widely used meta-reinforcement learning agent performs worse on the abstract tasks than on the matched metamers. In a second set of experiments, we base the tasks on abstractions derived directly from empirically identified human priors. We use the same procedure to generate corresponding metamer tasks, and see the same double dissociation between humans and agents. This work provides a foundation for characterizing differences between human and machine learning that can be used in future work towards developing machines with human-like behavior.