Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire strategies very different from those of humans. We show that co-training these agents on predicting representations from natural language task descriptions and from programs induced to generate such tasks guides them toward more human-like inductive biases. Human-generated language descriptions and program induction models that add new learned primitives both contain abstract concepts that can compress description length. Co-training on these representations results in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.
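To make the co-training setup concrete, the sketch below shows one way an auxiliary representation-prediction objective could be combined with a meta-RL loss: an extra head on the agent's recurrent core is trained to predict a precomputed embedding of the task's language description (or induced program). This is a minimal illustrative assumption, not the paper's exact architecture; all module names, dimensions, and the loss weighting are hypothetical.

# Sketch (assumed, not the authors' implementation): meta-RL agent with an
# auxiliary head that predicts a task-description embedding from its hidden state.
import torch
import torch.nn as nn

class CoTrainedMetaRLAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 128,
                 task_repr_dim: int = 64):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden_dim, batch_first=True)  # recurrent core for meta-RL
        self.policy_head = nn.Linear(hidden_dim, n_actions)           # action logits
        self.value_head = nn.Linear(hidden_dim, 1)                    # state-value estimate
        self.aux_head = nn.Linear(hidden_dim, task_repr_dim)          # predicts language/program embedding

    def forward(self, obs_seq: torch.Tensor):
        h_seq, _ = self.encoder(obs_seq)   # (batch, time, hidden_dim)
        h_last = h_seq[:, -1]              # final hidden state summarizes the episode so far
        return self.policy_head(h_last), self.value_head(h_last), self.aux_head(h_last)

def co_training_loss(agent, obs_seq, rl_loss, task_repr_target, aux_weight: float = 1.0):
    """Combine the usual meta-RL loss with an auxiliary representation-prediction loss.

    `task_repr_target` is a precomputed embedding of the human language description
    (or induced program) for the current task; `rl_loss` is whatever policy-gradient /
    value loss the meta-RL algorithm already produces. The weighting is arbitrary here.
    """
    _, _, pred_repr = agent(obs_seq)
    aux_loss = nn.functional.mse_loss(pred_repr, task_repr_target)
    return rl_loss + aux_weight * aux_loss

Under this kind of scheme, the auxiliary target is what carries the abstraction: swapping in less abstract targets (embeddings of synthetic descriptions, or programs induced without learned primitives) would leave the training loop unchanged while removing the compressed, concept-level structure the abstract argues is key.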