L\'evy walks and other theoretical models of optimal foraging have been successfully used to describe real-world scenarios, attracting attention in several fields such as economics, physics, ecology, and evolutionary biology. However, in most cases it remains unclear which strategies maximize foraging efficiency and whether such strategies can be learned by living organisms. To address these questions, we model foragers as reinforcement learning agents. We first prove theoretically that maximizing rewards in our reinforcement learning model is equivalent to optimizing foraging efficiency. We then show with numerical experiments that our agents learn foraging strategies that outperform known strategies such as L\'evy walks in efficiency.
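The setup described above can be illustrated with a minimal sketch, under assumptions not taken from the abstract: a hypothetical 2D periodic grid scattered with targets, an agent whose state is the number of steps walked since its last turn, two actions (continue straight or turn to a random direction), and a reward of 1 per target found, so that maximizing reward per step coincides with foraging efficiency. Tabular Q-learning stands in for whatever learning algorithm the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

class ForagingEnv:
    """Hypothetical foraging environment: targets on a periodic 2D grid;
    the walker receives reward 1 each time it lands on an unvisited target."""
    def __init__(self, size=100, n_targets=200):
        self.size = size
        self.targets = set(map(tuple, rng.integers(0, size, (n_targets, 2))))
        self.pos = np.array([0, 0])
        self.heading = np.array([1, 0])

    def step(self, turn):
        if turn:  # resample the walking direction uniformly
            dirs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
            self.heading = np.array(dirs[rng.integers(4)])
        self.pos = (self.pos + self.heading) % self.size
        key = tuple(self.pos)
        if key in self.targets:
            self.targets.discard(key)  # target consumed
            return 1.0
        return 0.0

def train(episodes=100, horizon=500, n_states=20, eps=0.1, alpha=0.1, gamma=0.95):
    """Tabular Q-learning: state = steps since last turn (capped),
    actions = {0: continue straight, 1: turn}."""
    Q = np.zeros((n_states, 2))
    for _ in range(episodes):
        env, s = ForagingEnv(), 0
        for _ in range(horizon):
            a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
            r = env.step(turn=(a == 1))
            s2 = 0 if a == 1 else min(s + 1, n_states - 1)
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

def efficiency(Q, horizon=2000):
    """Targets found per unit distance walked under the greedy policy."""
    env, s, found = ForagingEnv(), 0, 0.0
    for _ in range(horizon):
        a = int(Q[s].argmax())
        found += env.step(turn=(a == 1))
        s = 0 if a == 1 else min(s + 1, Q.shape[0] - 1)
    return found / horizon
```

Because each unit of reward corresponds to one target and each step to one unit of distance, the agent's average reward per step is exactly the foraging efficiency, which is the equivalence the abstract refers to; the learned policy here is only a toy analogue of the strategies studied in the paper.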