Smart Magnetic Microrobots Learn to Swim with Deep Reinforcement Learning

Swimming microrobots are increasingly built from complex materials with dynamic shapes, and they are expected to operate in complex environments where the system dynamics are difficult to model and positional control is not straightforward to achieve. Deep reinforcement learning is a promising method for autonomously developing robust controllers for smart microrobots, which can adapt their behavior to operate in uncharacterized environments without the need to model the system dynamics. Here, we report the development of a smart helical magnetic hydrogel microrobot that used the soft actor-critic reinforcement learning algorithm to autonomously derive a control policy. This policy allowed the microrobot to swim through an uncharacterized biomimetic fluidic environment under the control of a time-varying magnetic field generated by a three-axis array of electromagnets. The reinforcement learning agent learned successful control policies in fewer than 100,000 training steps, demonstrating sample efficiency for fast learning. We also demonstrate that the control policies learned by the agent can be fine-tuned by fitting mathematical functions to the learned policy's action distribution via regression. Deep reinforcement learning applied to microrobot control is likely to significantly expand the capabilities of the next generation of microrobots.
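The regression step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which mathematical functions were fitted or which state variable they depend on, so everything here is an assumption made for illustration: the sinusoidal form, the scalar state `theta`, the helper names `fit_sinusoid` and `smoothed_policy`, and the synthetic data standing in for actions sampled from a learned policy.

```python
import numpy as np

def fit_sinusoid(theta, action):
    """Least-squares fit of action ~ a*sin(theta) + b*cos(theta) + c.

    The sinusoidal form is an illustrative assumption, not the
    functional form reported by the authors.
    """
    # The model is linear in (a, b, c), so ordinary least squares suffices.
    design = np.column_stack([np.sin(theta), np.cos(theta), np.ones_like(theta)])
    coeffs, *_ = np.linalg.lstsq(design, action, rcond=None)
    return coeffs

def smoothed_policy(theta, coeffs):
    """Evaluate the fitted function as a deterministic, noise-free policy."""
    a, b, c = coeffs
    return a * np.sin(theta) + b * np.cos(theta) + c

# Synthetic stand-in for (state, action) pairs sampled from a learned policy.
# These values are hypothetical and are not data from the paper.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
true_coeffs = np.array([0.8, -0.3, 0.1])
noisy_actions = smoothed_policy(theta, true_coeffs) + rng.normal(0.0, 0.2, size=theta.shape)

coeffs = fit_sinusoid(theta, noisy_actions)
print("fitted (a, b, c):", coeffs)
```

Under these assumptions, the fitted function replaces the noisy sampled actions with a smooth closed-form mapping from state to action, which is one plausible reading of how regression could refine a learned policy.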
