The field of robotic Flexible Endoscopes (FEs) has progressed significantly, offering a promising way to reduce patient discomfort. However, the limited autonomy of most robotic FEs makes manoeuvres non-intuitive and challenging, which constrains their use in clinical settings. Although previous studies have employed lumen tracking for autonomous navigation, these methods fail to adapt to obstructions and sharp turns, where the endoscope faces the colon wall. In this work, we propose a Deep Reinforcement Learning (DRL)-based navigation strategy that eliminates the need for lumen tracking. DRL methods, however, pose safety risks because they do not account for the potential hazards of the actions taken. To ensure safety, we exploit a Constrained Reinforcement Learning (CRL) method that restricts the policy to a predefined safety regime. Moreover, we present a model selection strategy that utilises Formal Verification (FV) to choose a policy that is entirely safe before deployment. We validate our approach in a virtual colonoscopy environment and report that, out of the 300 trained policies, we could identify three that are entirely safe. Our work demonstrates that CRL, combined with model selection through FV, can improve the robustness and safety of robotic behaviour in surgical applications.
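For orientation, the two ingredients named above can be written in the standard constrained-MDP form. This is only a sketch under assumptions: the abstract does not give the reward $r$, the cost $c$, the discount factor $\gamma$, the cost budget $d$, or the verified safety properties $\varphi_j$, so all of these symbols are illustrative rather than the authors' definitions.

\[
\max_{\pi} \; J_R(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big]
\quad \text{subject to} \quad
J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\Big] \le d
\]

\[
\Pi_{\text{safe}} = \big\{ \pi_i \in \{\pi_1, \dots, \pi_{300}\} \;:\; \pi_i \models \varphi_j \ \text{for every safety property } \varphi_j \big\}
\]

Under this reading, CRL keeps the expected cost within the budget during training, while FV acts as a filter after training that retains only the policies provably satisfying every property. In the reported experiment, this filter left three of the 300 trained policies.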