Emergent behavior and neural dynamics in artificial agents tracking turbulent plumes

Tracking a turbulent plume to locate its source is a complex control problem: it requires multi-sensory integration and must be robust to intermittent odors, changing wind direction, and variable plume statistics. Flying insects routinely perform this task, often over long distances, in pursuit of food or mates. Many experimental studies have examined aspects of this remarkable behavior in detail. Here, we take a complementary in silico approach, using artificial agents trained with reinforcement learning to develop an integrated understanding of the behaviors and neural computations that support plume tracking. Specifically, we use deep reinforcement learning (DRL) to train recurrent neural network (RNN) agents to locate the source of simulated turbulent plumes. The agents' emergent behaviors resemble those of flying insects, and the RNNs learn to represent task-relevant variables, such as head direction and time since last odor encounter. Our analyses suggest an experimentally testable hypothesis for tracking plumes in changing wind direction: agents follow local plume shape rather than the current wind direction. While reflexive short-memory behaviors are sufficient for tracking plumes in constant wind, longer timescales of memory are essential for tracking plumes that switch direction. At the level of neural dynamics, the RNNs' population activity is low-dimensional and organized into distinct dynamical structures, with some correspondence to behavioral modules. Our in silico approach provides key intuitions for turbulent plume tracking strategies and motivates future targeted experimental and theoretical developments.
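
One of the task-relevant variables named above, time since last odor encounter, can be made concrete with a short sketch. This is only an illustration of the quantity itself, not the authors' analysis code: the function name `time_since_last_odor`, the detection `threshold`, and the time step `dt` are all assumptions introduced here.

```python
from typing import List, Sequence


def time_since_last_odor(
    odor: Sequence[float],
    threshold: float = 0.0,  # assumed detection threshold (hypothetical)
    dt: float = 1.0,         # assumed duration of one time step (hypothetical)
) -> List[float]:
    """Elapsed time since the odor signal last exceeded `threshold`.

    Steps before the first encounter are reported as infinity.
    """
    elapsed = float("inf")
    result = []
    for concentration in odor:
        if concentration > threshold:
            elapsed = 0.0   # odor encountered: reset the clock
        else:
            elapsed += dt   # no odor: the clock keeps running (inf stays inf)
        result.append(elapsed)
    return result


# An intermittent odor trace, as an agent might sense inside a turbulent plume.
trace = [0.0, 0.8, 0.0, 0.0, 0.3, 0.0]
print(time_since_last_odor(trace))  # [inf, 0.0, 1.0, 2.0, 0.0, 1.0]
```

A memoryless, purely reflexive controller has no access to such a running quantity, which is one way to read the abstract's distinction between short-memory behaviors and the longer memory timescales needed when the plume switches direction.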
