Autonomous marine vehicles play an essential role in many ocean science and engineering applications. Planning time- and energy-optimal paths for these vehicles in stochastic dynamic ocean environments is key to reducing operational costs. In some missions, the vehicles must also harvest solar, wind, or wave energy (modeled as a stochastic scalar field) and follow paths that minimize net energy consumption. Markov Decision Processes (MDPs) provide a natural framework for sequential decision-making by robotic agents in such environments. However, building a realistic model and solving the resulting MDP becomes computationally expensive in large-scale real-time applications, which calls for parallel algorithms and efficient implementations. In the present work, we introduce an efficient end-to-end GPU-accelerated algorithm that (i) builds the MDP model (computing transition probabilities and expected one-step rewards) and (ii) solves the MDP to compute an optimal policy. We develop methodological and algorithmic solutions to overcome the limited global memory of GPUs by (i) using a dynamic reduced-order representation of the ocean flows, (ii) leveraging the sparse nature of the state transition probability matrix, (iii) introducing a neighbouring sub-grid concept, and (iv) proving that the mean of the stochastic scalar field alone is sufficient to compute the expected one-step rewards for missions involving energy harvesting from the environment, thereby saving memory and reducing computational effort. We demonstrate the algorithm on a simulated stochastic dynamic environment and show that it builds the MDP model and computes the optimal policy 600-1000x faster than conventional CPU implementations, making it suitable for real-time use.
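To make the second step (solving the MDP once transition probabilities and expected one-step rewards are known) more concrete, the following is a minimal CPU sketch of value iteration that exploits transition sparsity by storing only a few successor states per state-action pair. It is not the GPU implementation described here: the function name `value_iteration`, the array layout (`next_state`, `prob`, `reward`), the discount factor, and the five-cell toy chain are all assumptions chosen for illustration.

```python
import numpy as np


def value_iteration(next_state, prob, reward, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Value iteration on a sparsely stored MDP (hypothetical layout).

    next_state[s, a, k]: index of the k-th possible successor of (s, a)
    prob[s, a, k]:       probability of reaching that successor
    reward[s, a]:        expected one-step reward of taking a in s
    """
    value = np.zeros(reward.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = R[s, a] + gamma * sum_k P[s, a, k] * V[next_state[s, a, k]]
        q = reward + gamma * np.sum(prob * value[next_state], axis=2)
        new_value = q.max(axis=1)
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    q = reward + gamma * np.sum(prob * value[next_state], axis=2)
    return value, q.argmax(axis=1)


# Toy problem (assumed): a 1-D chain of 5 cells; cell 4 is an absorbing target.
# Actions: 0 = left, 1 = right. A move succeeds with probability 0.8,
# otherwise the vehicle stays in place. Each step costs 1 until the target.
n_states, n_actions, n_succ = 5, 2, 2
next_state = np.zeros((n_states, n_actions, n_succ), dtype=int)
prob = np.zeros((n_states, n_actions, n_succ))
reward = np.full((n_states, n_actions), -1.0)

for s in range(n_states - 1):
    for a, step in enumerate((-1, +1)):
        target = min(max(s + step, 0), n_states - 1)
        next_state[s, a] = (target, s)
        prob[s, a] = (0.8, 0.2)

# The target state is absorbing and incurs no further cost.
next_state[n_states - 1] = n_states - 1
prob[n_states - 1, :, 0] = 1.0
reward[n_states - 1] = 0.0

value, policy = value_iteration(next_state, prob, reward)
print(policy[:-1])  # greedy action in each non-target cell
```

Storing a fixed number of successors per state-action pair keeps memory proportional to the number of reachable neighbours rather than to the square of the state count, which is the general idea behind exploiting a sparse transition matrix. The per-state updates inside each sweep are independent of one another, which is what makes this kind of computation a natural candidate for parallel hardware.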