Continuous Control with Deep Reinforcement Learning for Autonomous Vessels

Maritime autonomous transportation has played a crucial role in the globalization of the world economy. Deep Reinforcement Learning (DRL) has been applied to automatic path planning to simulate vessel collision avoidance situations in open seas. End-to-end approaches that learn complex mappings directly from the input generalize poorly when the agent must reach targets in different environments. In this work, we present a new strategy called state-action rotation, which improves the agent's performance in unseen situations by rotating each obtained experience (state-action-state) and storing the rotated copies in the replay buffer. We designed our model based on Deep Deterministic Policy Gradient, a local view maker, and a planner. Our agent uses two deep Convolutional Neural Networks to estimate the policy and action-value functions. The proposed model was exhaustively trained and tested in maritime scenarios with real maps from cities such as Montreal and Halifax. Experimental results show that state-action rotation on top of the CVN consistently improves the rate of arrival to a destination (RATD) by up to 11.96% with respect to the Vessel Navigator with Planner and Local View (VNPLV), and improves performance on unseen maps by up to 30.82%. Our proposed approach is more robust when tested in a new environment, supporting the idea that generalization can be achieved by using state-action rotation.
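The state-action rotation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that the abstract does not confirm: the state is a square 2-D local-view grid centered on the vessel, the action is a continuous (d_row, d_col) displacement in grid coordinates, rotations are restricted to multiples of 90 degrees, and the reward is unchanged by rotation. The names `rotate_action`, `rotated_transitions`, `store`, and `replay_buffer` are hypothetical.

```python
from collections import deque

import numpy as np


def rotate_action(action, k):
    """Rotate a (d_row, d_col) action by k * 90 degrees.

    The direction matches np.rot90 (counterclockwise in array display),
    which maps a displacement (d_row, d_col) to (-d_col, d_row).
    """
    d_row, d_col = action
    for _ in range(k % 4):
        d_row, d_col = -d_col, d_row
    return np.array([d_row, d_col], dtype=np.float32)


def rotated_transitions(state, action, reward, next_state, done):
    """Yield the original transition and its 90/180/270-degree rotations.

    Assumption: the reward and the terminal flag do not depend on the
    absolute orientation of the map, so they are copied unchanged.
    """
    for k in range(4):
        yield (
            np.rot90(state, k).copy(),
            rotate_action(action, k),
            reward,
            np.rot90(next_state, k).copy(),
            done,
        )


# Hypothetical replay buffer; the capacity is an arbitrary illustrative value.
replay_buffer = deque(maxlen=100_000)


def store(state, action, reward, next_state, done):
    """Store one observed transition together with its rotated copies."""
    replay_buffer.extend(
        rotated_transitions(state, action, reward, next_state, done)
    )
```

Under these assumptions, each observed transition contributes four entries to the buffer, so an off-policy learner such as DDPG samples the same experience in several orientations. Rotating the state and the action by the same angle keeps the pair consistent: an obstacle that lay ahead of the chosen action before rotation still lies ahead of it afterwards.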

