This paper introduces a computational model of creative problem solving in deep reinforcement learning agents, inspired by cognitive theories of creativity. The model, AIGenC, aims to enable artificial agents to learn, use and generate transferable representations. AIGenC is embedded in a deep learning architecture that includes three main components: concept processing, reflective reasoning, and blending of concepts. The first component extracts objects and affordances from sensory input and encodes them in a concept space, represented as a hierarchical graph structure. Concept representations are stored in a dual memory system. Goal-directed and temporal information acquired by the agent during deep reinforcement learning enriches these representations, creating a higher level of abstraction in the concept space. In parallel, a process akin to reflective reasoning detects and retrieves from memory the concepts relevant to the task, using a matching process that calculates a similarity value between the current state and the graph structures held in memory. Once an interaction is finalised, rewards and temporal information are added to the graph structure, creating a higher abstraction level. If reflective reasoning fails to offer a suitable solution, a blending process comes into play and creates new concepts by combining past information. We discuss the model's capability to yield better out-of-distribution generalisation in artificial agents, thus advancing toward artificial general intelligence. To the best of our knowledge, this is the first computational model, beyond mere formal theories, that posits a solution to creative problem solving within a deep learning architecture.
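The retrieve-then-blend control flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the AIGenC model itself: concepts are reduced to flat sets of nodes and labelled edges rather than a hierarchical graph, the similarity value is a Jaccard overlap, blending is a plain union of the two closest memory graphs, and the names `ConceptGraph`, `similarity`, `blend`, `retrieve_or_blend` and the `threshold` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptGraph:
    """Hypothetical flat stand-in for a concept graph."""
    nodes: frozenset  # object and affordance labels
    edges: frozenset  # (source, relation, target) triples


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two sets in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(g1: ConceptGraph, g2: ConceptGraph) -> float:
    """Assumed similarity value: mean of node and edge Jaccard overlap."""
    return 0.5 * jaccard(g1.nodes, g2.nodes) + 0.5 * jaccard(g1.edges, g2.edges)


def blend(g1: ConceptGraph, g2: ConceptGraph) -> ConceptGraph:
    """Assumed blending rule: union of the two graphs."""
    return ConceptGraph(g1.nodes | g2.nodes, g1.edges | g2.edges)


def retrieve_or_blend(state, memory, threshold=0.6):
    """Return the best memory match, or a blend if no match is close enough."""
    ranked = sorted(memory, key=lambda g: similarity(state, g), reverse=True)
    if ranked and similarity(state, ranked[0]) >= threshold:
        return "retrieved", ranked[0]
    if len(ranked) >= 2:
        # Reflective reasoning found nothing suitable: combine past concepts.
        return "blended", blend(ranked[0], ranked[1])
    return "none", None
```

In this sketch a state identical to a stored graph is retrieved directly, while a state that only partially overlaps with every stored graph falls through to blending. The threshold and the choice to blend the two nearest graphs are design placeholders, not claims about how the model selects concepts.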