Recent advances in reinforcement learning (RL) have increased the promise of introducing cognitive assistance and automation to robot-assisted laparoscopic surgery (RALS). However, progress in algorithms and methods depends on the availability of standardized learning environments that represent skills relevant to RALS. We present LapGym, a framework for building RL environments for RALS that models the challenges posed by surgical tasks, and sofa_env, a diverse suite of 12 environments. Motivated by surgical training, these environments are organized into 4 tracks: Spatial Reasoning, Deformable Object Manipulation & Grasping, Dissection, and Thread Manipulation. Each environment is highly parametrizable for increasing difficulty, resulting in a high performance ceiling for new algorithms. We use Proximal Policy Optimization (PPO) to establish a baseline for model-free RL algorithms, investigating the effect of several environment parameters on task difficulty. Finally, we show that many environments and parameter configurations reflect well-known, open problems in RL research, allowing researchers to continue exploring these fundamental problems in a surgical context. We aim to provide a challenging, standard environment suite for further development of RL for RALS, ultimately helping to realize the full potential of cognitive surgical robotics. LapGym is publicly accessible through GitHub (https://github.com/ScheiklP/lap_gym).
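As a minimal, illustrative sketch of the intended workflow (assuming sofa_env environments follow the Gym interface and using stable-baselines3's PPO implementation; the module path, class name, and constructor shown here are assumptions for illustration, not the documented API):

    # Minimal training sketch (not from the paper). The stable-baselines3 PPO
    # calls are the library's real API; the sofa_env import path, class name,
    # and constructor are hypothetical placeholders.
    from stable_baselines3 import PPO
    from sofa_env.scenes.tissue_dissection.tissue_dissection_env import TissueDissectionEnv  # hypothetical path

    env = TissueDissectionEnv()  # assumed Gym-compatible environment with default parameters
    model = PPO("MlpPolicy", env, verbose=1)  # MlpPolicy assumes state (non-image) observations
    model.learn(total_timesteps=1_000_000)

In this sketch, increasing task difficulty would amount to passing different values for the environment's parameters at construction time, and the PPO baseline would be retrained per configuration.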