Artificial neural systems trained with reinforcement, supervised, and unsupervised learning all acquire internal representations of high-dimensional input. To what extent these representations depend on the different learning objectives is largely unknown. Here we compare the representations learned by eight different convolutional neural networks, each with an identical ResNet architecture and trained on the same family of egocentric images, but embedded within different learning systems. Specifically, the representations are trained to guide action in a compound reinforcement learning (RL) task; to predict one, or a combination, of three task-related targets with supervision; or to optimize one of three different unsupervised objectives. Using representational similarity analysis, we find that the network trained with reinforcement learning differs most from the other networks. Further analysis with metrics inspired by the neuroscience literature shows that the model trained with reinforcement learning has a sparse, high-dimensional representation in which individual images are encoded by very different patterns of neural activity. This analysis suggests that such representations may arise to guide long-term behavior and goal-seeking in the RL agent. Our results provide insight into how the properties of neural representations are shaped by objective functions and can inform transfer learning approaches.
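The representational similarity analysis mentioned above can be sketched in a few lines. The sketch below is illustrative only, not the paper's exact pipeline: it builds a representational dissimilarity matrix (RDM) from each network's activations and compares two networks by correlating the upper triangles of their RDMs (Pearson correlation is used here for simplicity; Spearman is also common in the RSA literature, and the paper's particular distance and correlation choices are not specified in this abstract).

```python
import numpy as np

def rdm(activations):
    """Representational dissimilarity matrix: 1 minus the Pearson
    correlation between activation patterns for each pair of images.
    activations: array of shape (n_images, n_units)."""
    return 1.0 - np.corrcoef(activations)

def rsa_similarity(acts_a, acts_b):
    """Compare two networks' representations of the same image set by
    correlating the upper triangles of their RDMs."""
    n = acts_a.shape[0]
    iu = np.triu_indices(n, k=1)          # off-diagonal upper entries
    va, vb = rdm(acts_a)[iu], rdm(acts_b)[iu]
    return np.corrcoef(va, vb)[0, 1]

# Toy usage with random "activations" for 20 images and 64 units.
rng = np.random.default_rng(0)
acts = rng.normal(size=(20, 64))
print(round(rsa_similarity(acts, acts), 3))  # a network vs. itself -> 1.0
```

A high RSA score means the two networks impose a similar geometry on the stimulus set, even if their individual units differ; this is what allows architecturally identical networks with different objectives to be compared on equal footing.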