SHARP: Shielding-Aware Robust Planning for Safe and Efficient Human-Robot Interaction

Jointly achieving safety and efficiency in human-robot interaction (HRI) settings is a challenging problem, as the robot's planning objectives may be at odds with the human's own intent and expectations. Recent approaches ensure safe robot operation in uncertain environments through a supervisory control scheme, sometimes called "shielding", which overrides the robot's nominal plan with a safety fallback strategy when a safety-critical event is imminent. These reactive "last-resort" strategies (typically in the form of aggressive emergency maneuvers) focus on preserving safety without efficiency considerations; when the nominal planner is unaware of possible safety overrides, shielding can be activated more frequently than necessary, leading to degraded performance. In this work, we propose a new shielding-based planning approach that allows the robot to plan efficiently by explicitly accounting for possible future shielding events. Leveraging recent work on Bayesian human motion prediction, the resulting robot policy proactively balances nominal performance with the risk of high-cost emergency maneuvers triggered by low-probability human behaviors. We formalize Shielding-Aware Robust Planning (SHARP) as a stochastic optimal control problem and propose a computationally efficient framework for finding tractable approximate solutions at runtime. Our method outperforms the shielding-agnostic motion planning baseline (equipped with the same human intent inference scheme) on simulated driving examples with human trajectories taken from the recently released Waymo Open Motion Dataset.
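The trade-off described above can be illustrated with a minimal sketch. It is not the SHARP algorithm itself, which the abstract formulates as a stochastic optimal control problem solved approximately at runtime. It is a one-shot toy in which the robot picks among a few candidate plans, the human has two discrete behavior modes, and a shield is assumed to override any plan whose closest approach falls below a fixed gap. The names `Plan`, `SAFE_GAP`, `SHIELD_COST`, the mode labels, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    nominal_cost: float  # cost if the plan runs with no safety override
    min_gap: dict        # hypothetical closest approach (m) per human mode


SAFE_GAP = 2.0      # assumed gap (m) below which the shield would override
SHIELD_COST = 50.0  # assumed extra cost of an emergency maneuver


def shield_triggers(plan, mode):
    """True if the shield would override this plan under the given human mode."""
    return plan.min_gap[mode] < SAFE_GAP


def expected_cost(plan, belief):
    """Nominal cost plus the probability-weighted cost of a shielding event."""
    p_shield = sum(p for mode, p in belief.items() if shield_triggers(plan, mode))
    return plan.nominal_cost + p_shield * SHIELD_COST


def plan_agnostic(plans):
    """Shielding-agnostic baseline: ignores possible overrides."""
    return min(plans, key=lambda pl: pl.nominal_cost)


def plan_aware(plans, belief):
    """Shielding-aware choice: accounts for possible future overrides."""
    return min(plans, key=lambda pl: expected_cost(pl, belief))


def bayes_update(belief, likelihood):
    """Discrete Bayesian update of the belief over human modes."""
    unnorm = {m: belief[m] * likelihood[m] for m in belief}
    total = sum(unnorm.values())
    return {m: v / total for m, v in unnorm.items()}


plans = [
    Plan("overtake", 10.0, {"keep_lane": 5.0, "cut_in": 1.0}),
    Plan("follow", 14.0, {"keep_lane": 6.0, "cut_in": 3.0}),
]
belief = {"keep_lane": 0.9, "cut_in": 0.1}

print(plan_agnostic(plans).name)       # picks by nominal cost only
print(plan_aware(plans, belief).name)  # weighs the risk of an override
```

With a 10% belief that the human cuts in, the aware planner in this toy prefers the slower plan because the expected override penalty outweighs the nominal saving. If the belief in a cut-in drops to 2%, the same planner returns to the faster plan, which is the balance between nominal performance and low-probability emergency maneuvers that the abstract describes.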
