Meta-Learning of Structured Task Distributions in Humans and Machines

In recent years, meta-learning, in which a model is trained on a family of tasks (i.e., a task distribution), has emerged as an approach to training neural networks to perform tasks that were previously assumed to require structured representations, making strides toward closing the gap between humans and machines. However, we argue that evaluating meta-learning remains a challenge, and that standard evaluations can miss whether a meta-learner actually uses the structure embedded within the tasks. These meta-learners might therefore still differ significantly from human learners. To demonstrate this difference, we first define a new meta-reinforcement learning task in which a structured task distribution is generated using a compositional grammar. We then introduce a novel approach to constructing a "null task distribution" with the same statistical complexity as this structured task distribution but without the explicit rule-based structure used to generate the structured task. We train a standard meta-learning agent, a recurrent network trained with model-free reinforcement learning, and compare it with human performance across the two task distributions. We find a double dissociation in which humans do better in the structured task distribution whereas agents do better in the null task distribution, despite comparable statistical complexity. This work highlights that multiple strategies can achieve reasonable meta-test performance, and that careful construction of control task distributions is a valuable way to understand which strategies meta-learners acquire and how they might differ from humans.
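
The contrast between a structured and a null task distribution can be made concrete with a toy sketch. The abstract does not specify the grammar, the task format, or how statistical complexity is matched, so everything below is an assumption for illustration: the 7x7 grid, the "chain of straight segments" grammar, and the function names are hypothetical. The null sampler is a crude stand-in that matches only the number of marked cells, which is a much weaker statistic than the matching the abstract describes.

```python
import random
from collections import deque

SIZE = 7  # assumed grid size, chosen only for illustration
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def sample_structured_board(rng, n_segments=3, seg_len=3):
    """Toy compositional grammar: BOARD -> SEGMENT SEGMENT ...

    Each segment is a straight run of cells that starts where the
    previous segment ended, so the marked cells always form one
    connected shape.
    """
    board = [[0] * SIZE for _ in range(SIZE)]
    r, c = rng.randrange(SIZE), rng.randrange(SIZE)
    board[r][c] = 1
    for _ in range(n_segments):
        dr, dc = rng.choice(DIRECTIONS)
        for _ in range(seg_len):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < SIZE and 0 <= nc < SIZE):
                break  # segment is cut short at the grid edge
            r, c = nr, nc
            board[r][c] = 1
    return board


def sample_null_board(rng, structured_board):
    """Crude null control: keep the number of marked cells, drop the rules.

    Assumption: matching only the cell count. This is NOT the matching
    procedure from the paper, just the simplest possible control.
    """
    n_marked = sum(map(sum, structured_board))
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    board = [[0] * SIZE for _ in range(SIZE)]
    for r, c in rng.sample(cells, n_marked):
        board[r][c] = 1
    return board


def is_connected(board):
    """Return True if all marked cells form one 4-connected component."""
    marked = {(r, c) for r in range(SIZE) for c in range(SIZE) if board[r][c]}
    if not marked:
        return True
    start = next(iter(marked))
    seen, queue = {start}, deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in DIRECTIONS:
            nxt = (r + dr, c + dc)
            if nxt in marked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == marked


if __name__ == "__main__":
    rng = random.Random(0)
    structured = sample_structured_board(rng)
    null = sample_null_board(rng, structured)
    for name, board in (("structured", structured), ("null", null)):
        print(name, "connected:", is_connected(board))
        for row in board:
            print("".join("#" if cell else "." for cell in row))
```

In this toy version, structured boards are connected by construction, while null boards with the same number of marked cells usually are not. A learner that exploits the generative rules should therefore find the structured boards easier than the null ones.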
