Safe control methods are often designed to behave safely even under worst-case human uncertainty. However, humans may exploit such safety-first systems, resulting in greater risk for everyone. Despite its significance, no prior work has investigated or accounted for this behavior in safe control. In this paper, we leverage an interaction-based payoff structure from game theory to model humans' short-sighted, self-seeking behavior and how humans adapt their strategies toward machines based on prior experience. We integrate these strategic human behaviors into a safe control architecture. As a result, our approach achieves better safety-performance trade-offs than both deterministic worst-case safe control techniques and equilibrium-based stochastic methods. Our findings suggest an urgent need to fundamentally rethink the safe control framework used in human-technology interaction in pursuit of greater safety for all.