We start by discussing the link between ecosystem simulators and general AI. Then we present the open-source ecosystem simulator Ecotwin, which is based on the game engine Unity and operates on ecosystems containing inanimate objects, such as mountains and lakes, as well as organisms, such as animals and plants. Animal cognition is modeled by integrating three separate networks: (i) a reflex network for hard-wired reflexes; (ii) a happiness network that maps sensory data, such as oxygen, water, energy, and smells, to a scalar happiness value; and (iii) a policy network for selecting actions. The policy network is trained with reinforcement learning (RL), where the reward signal is defined as the happiness difference from one time step to the next. All organisms are capable of either sexual or asexual reproduction, and they die if they run out of critical resources. We report results from three studies with Ecotwin, in which natural phenomena emerge in the models without being hard-wired. First, we study a terrestrial ecosystem with wolves, deer, and grass, in which Lotka-Volterra-style population dynamics emerge. Second, we study a marine ecosystem with phytoplankton, copepods, and krill, in which diel vertical migration behavior emerges. Third, we study an ecosystem involving lethal dangers, in which certain agents that combine RL with reflexes outperform pure RL agents.
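The reward definition can be illustrated with a minimal sketch. The function names and the weighted-sum happiness model below are hypothetical placeholders, not the actual Ecotwin networks; the only element taken from the text is that the RL reward equals the happiness difference between consecutive time steps.

```python
# Minimal sketch (hypothetical names): the RL reward at each time step is the
# change in the scalar happiness value produced by the happiness network.

def happiness(oxygen: float, water: float, energy: float, smell: float) -> float:
    """Stand-in for the happiness network: maps sensory data to a scalar.
    This weighted sum is only an illustrative placeholder, not Ecotwin's model."""
    return 0.25 * (oxygen + water + energy) + 0.25 * smell

def reward(prev_sensors: dict, curr_sensors: dict) -> float:
    """Reward = happiness difference from one time step to the next."""
    return happiness(**curr_sensors) - happiness(**prev_sensors)

# Example: energy increased after eating, so the reward is positive.
prev = dict(oxygen=1.0, water=0.8, energy=0.3, smell=0.1)
curr = dict(oxygen=1.0, water=0.8, energy=0.6, smell=0.1)
print(reward(prev, curr))  # 0.075
```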