Improvements in autonomy offer the potential for positive outcomes in a number of domains, yet guaranteeing the safe deployment of autonomous systems is difficult. This work investigates how humans can intelligently supervise agents to achieve some level of safety even when performance guarantees are elusive. The motivating research question is: in safety-critical settings, can we avoid the need to have one human supervise one machine at all times? The paper formalizes this 'scaling supervision' problem and investigates its application to the safety-critical context of autonomous vehicles (AVs) merging into traffic. It proposes a conservative, reachability-based method to reduce the burden on the AVs' human supervisors, which allows for the establishment of high-confidence upper bounds on the supervision requirements in this setting. Order statistics and traffic simulations with deep reinforcement learning show, analytically and numerically, that teaming of AVs enables supervision time that is sublinear in AV adoption. A key takeaway is that, despite the present imperfections of AVs, supervision becomes more tractable as AVs are deployed en masse. While this work focuses on AVs, the scalable supervision framework is relevant to a broader array of autonomous control challenges.