This paper investigates the impact-time control problem and proposes a learning-based computational guidance algorithm to solve it. The algorithm follows a general prediction-correction concept: a deep neural network estimates the exact time-to-go under proportional navigation guidance with realistic aerodynamic characteristics, and a biased command that nullifies the impact-time error is developed using reinforcement learning. The deep neural network is integrated into the reinforcement learning block to resolve the sparse-reward issue observed in typical reinforcement learning formulations. Extensive numerical simulations are conducted to validate the proposed algorithm.
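The prediction-correction concept can be illustrated with a minimal planar sketch. It is not the paper's method. The prediction step here uses the classical small-angle time-to-go approximation for proportional navigation in place of the deep neural network, and the correction step uses a hand-tuned proportional bias in place of the learned reinforcement learning policy. The sketch further assumes constant speed, a stationary target, and no aerodynamic effects. All function names, gains, and numerical values are hypothetical.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def estimate_time_to_go(rng, speed, lead_angle, nav_gain):
    """Prediction step (assumed stand-in for the paper's neural network).

    Small-angle approximation of the time-to-go under proportional
    navigation with constant speed and a stationary target.
    """
    return rng / speed * (1.0 + lead_angle ** 2 / (2.0 * (2.0 * nav_gain - 1.0)))


def simulate(desired_time=None, bias_gain=10.0, nav_gain=3.0, speed=250.0,
             heading0=math.radians(30.0), target=(10_000.0, 0.0),
             dt=0.01, accel_limit=100.0, hit_radius=5.0, t_max=200.0):
    """Fly a point-mass vehicle to the target and return the impact time in seconds.

    With desired_time=None the command is pure proportional navigation.
    Otherwise a bias proportional to the predicted impact-time error is added
    (assumed stand-in for the paper's reinforcement learning correction).
    """
    x, y, heading, t = 0.0, 0.0, heading0, 0.0
    while t < t_max:
        dx, dy = target[0] - x, target[1] - y
        rng = math.hypot(dx, dy)
        if rng < hit_radius:
            return t

        los = math.atan2(dy, dx)                    # line-of-sight angle
        lead = wrap_angle(heading - los)            # heading error w.r.t. line of sight
        los_rate = -speed * math.sin(lead) / rng    # stationary-target kinematics
        accel = nav_gain * speed * los_rate         # baseline PN command

        if desired_time is not None:
            predicted_impact = t + estimate_time_to_go(rng, speed, lead, nav_gain)
            error = desired_time - predicted_impact
            # Positive error: arriving too early, so steer away from the
            # line of sight to lengthen the path.
            accel += bias_gain * error * math.copysign(1.0, lead)

        accel = max(-accel_limit, min(accel_limit, accel))
        heading += accel / speed * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        t += dt
    raise RuntimeError("target not reached before t_max")


if __name__ == "__main__":
    print("PN only impact time:", simulate(desired_time=None))
    print("Biased PN impact time (desired 43 s):", simulate(desired_time=43.0))
```

In this toy setting the biased command pulls the impact time toward the desired value, while pure proportional navigation arrives earlier. The predicted impact-time error is available at every step, which suggests one plausible way a time-to-go predictor could supply a dense learning signal where a terminal-only reward would be sparse.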