Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages

The use of neural networks and reinforcement learning has become increasingly popular in autonomous vehicle control. However, the opacity of the resulting control policies presents a significant barrier to deploying neural network-based control in autonomous vehicles. In this paper, we present a reinforcement learning based approach to autonomous vehicle longitudinal control, in which rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent. By guiding the agent to meaningful states and actions, this weak supervision improves convergence during training and enhances the safety of the final trained policy. The rule-based supervisory controller has the further advantage of being fully interpretable, which allows traditional validation and verification approaches to be used to ensure the safety of the vehicle. We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance. Additionally, we show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when that model could not be trained to drive through reinforcement learning alone.
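To make the idea of a safety cage that both overrides unsafe actions and supplies a weak training signal more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the cage rules, so the time-headway rule, the 1.0 s threshold, the action range of [-1, 1] (negative values meaning braking), and the names `safety_cage`, `CageResult`, and `penalty_scale` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CageResult:
    action: float      # action actually sent to the vehicle, assumed in [-1, 1]
    intervened: bool   # True if the cage overrode the agent's action
    penalty: float     # non-positive reward term used as weak supervision


def time_headway(gap_m: float, ego_speed_mps: float) -> float:
    """Seconds the ego vehicle needs to cover the gap to the lead vehicle."""
    if ego_speed_mps <= 0.0:
        return float("inf")  # a stationary vehicle never closes the gap
    return gap_m / ego_speed_mps


def safety_cage(
    agent_action: float,
    gap_m: float,
    ego_speed_mps: float,
    min_headway_s: float = 1.0,   # assumed threshold, not from the paper
    brake_action: float = -1.0,   # assumed full-braking command
    penalty_scale: float = 1.0,   # assumed weight of the supervision signal
) -> CageResult:
    """Rule-based check applied to every action proposed by the RL agent."""
    if time_headway(gap_m, ego_speed_mps) >= min_headway_s:
        # Safe state: pass the agent's action through untouched.
        return CageResult(agent_action, False, 0.0)

    # Unsafe state: force at least the braking command.
    safe_action = min(agent_action, brake_action)
    # Penalise the agent in proportion to how far its action was from the
    # safe one, so interventions double as a weak training signal.
    penalty = -penalty_scale * abs(agent_action - safe_action)
    return CageResult(safe_action, True, penalty)
```

In a training loop, the returned `action` would be applied to the environment and `penalty` added to the environment reward before the agent's update. Because the rule is a plain threshold comparison, it can be inspected and tested independently of the learned policy, which is the interpretability property the abstract emphasises.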

