The electromagnetic detection satellite scheduling problem (EDSSP) has attracted attention because of the need to detect a large number of targets. This paper proposes a mixed-integer programming model for the EDSSP and an evolutionary algorithm framework based on reinforcement learning (RL-EA). The model considers numerous factors that affect electromagnetic detection, such as detection mode and bandwidth. The RL-EA framework is built on Q-learning, and each individual in the population is regarded as an agent. Based on the proposed framework, a Q-learning-based genetic algorithm (QGA) is designed. Q-learning guides the population search process by choosing variation operators. In the algorithm, we design a reward function to update the Q value. According to the problem characteristics, a new combination of <state, action> is proposed. The QGA also uses an elite individual retention strategy to improve search performance. In addition, a task time window selection algorithm is proposed to evaluate the performance of population evolution. Experiments at various scales are used to examine the planning performance of the proposed algorithm. Experimental verification on multiple instances shows that the QGA can solve the EDSSP effectively. Compared with state-of-the-art algorithms, the QGA performs better in several respects.
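The idea of using Q-learning to choose variation operators can be illustrated with a minimal sketch. This is not the paper's QGA: the abstract does not define the state encoding, the reward function, or the operator set, so the integer states, the scalar reward, and the operator names `"swap"` and `"insert"` below are hypothetical placeholders. The sketch uses the standard tabular Q-learning update with epsilon-greedy selection.

```python
import random


class OperatorSelector:
    """Tabular Q-learning where each action is a variation operator.

    Assumptions (not from the paper): states are small integers,
    the reward is a scalar supplied by the caller, and exploration
    is epsilon-greedy.
    """

    def __init__(self, n_states, operators, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.operators = list(operators)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        # Q table: one row per state, one column per operator.
        self.q = [[0.0] * len(self.operators) for _ in range(n_states)]
        self.rng = random.Random(seed)

    def choose(self, state):
        """Return the index of the operator to apply in this state."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.operators))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update for one <state, action> pair."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Hypothetical usage: reward the second operator once, then select greedily.
selector = OperatorSelector(n_states=2, operators=["swap", "insert"],
                            epsilon=0.0)
selector.update(state=0, action=1, reward=1.0, next_state=0)
chosen = selector.operators[selector.choose(0)]
```

In a genetic algorithm loop, `choose` would be called before each variation step and `update` afterwards, with a reward derived from the change in fitness; how the paper computes that reward is not stated in the abstract.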