In percutaneous coronary intervention for the treatment of coronary plaques, guidewire navigation is a primary procedure for stent delivery. Steering a flexible guidewire within the coronary arteries requires considerable training, and the non-linearity between the control input and the movement of the guidewire makes precise manipulation difficult. Here, we introduce a deep reinforcement learning (RL) framework for autonomous guidewire navigation in robot-assisted coronary intervention. Using Rainbow, a segment-wise learning approach is applied to determine how best to accelerate training with human demonstrations, combining deep Q-learning from demonstrations (DQfD), transfer learning, and weight initialization. The RL `state' is customized as a focus window near the guidewire tip, and subgoals are placed to mitigate the sparse-reward problem. The RL agent improves its performance, eventually enabling the guidewire to reach all valid targets in the `stable' phase. Our framework opens a new direction in the automation of robot-assisted intervention and provides guidance on applying RL in physical spaces subject to mechanical fatigue.
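The two ideas most specific to this abstract, a tip-centred focus-window state and subgoal-based reward shaping, can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: the image format, window size, subgoal tolerance, and all reward magnitudes are hypothetical.

```python
import numpy as np

def focus_window(frame, tip_xy, size=64):
    """Crop a size x size observation window centred on the guidewire tip.

    frame:  2-D grayscale image of the vessel scene (H x W), assumed format
    tip_xy: (row, col) pixel position of the guidewire tip
    size:   side length of the square focus window (illustrative value)
    """
    half = size // 2
    r, c = tip_xy
    # Pad the frame so windows near the image border keep a fixed shape.
    padded = np.pad(frame, half, mode="constant")
    # In padded coordinates the tip sits at (r + half, c + half),
    # so this slice is centred on the tip.
    return padded[r:r + size, c:c + size]

def shaped_reward(tip_xy, subgoals, goal_xy, reached, tol=5.0):
    """Mitigate sparse rewards with intermediate subgoal bonuses.

    Returns (reward, reached), where `reached` tracks subgoal indices
    already credited so each bonus is paid only once. Magnitudes are
    assumptions for illustration only.
    """
    for i, sg in enumerate(subgoals):
        if i not in reached and np.linalg.norm(np.subtract(tip_xy, sg)) < tol:
            reached.add(i)
            return 1.0, reached   # one-time subgoal bonus
    if np.linalg.norm(np.subtract(tip_xy, goal_xy)) < tol:
        return 10.0, reached      # terminal bonus at the valid target
    return -0.01, reached         # small per-step penalty encourages progress
```

Without the subgoal terms, the agent would receive informative feedback only on the rare episodes that reach the target; the intermediate bonuses give a learning signal along the vessel path.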