Intelligent games have become a hot research area, attracting growing attention from researchers. This article proposes an algorithm that combines multi-attribute management with reinforcement learning and applies the combination to wargaming, solving two problems in intelligent wargame training: the agent's low win rate against specific rule-based opponents and its inability to converge quickly. This paper studies a multi-attribute decision-making and reinforcement learning algorithm in a wargame simulation environment and obtains data on red-versus-blue confrontations. The weight of each attribute is calculated using intuitionistic fuzzy number weight calculations, and the threat posed by each of the opponent's pieces is then determined. The resulting threat assessment serves as the red side's reinforcement learning reward function; the actor-critic (AC) framework is trained on this reward function, yielding an algorithm that combines multi-attribute decision making with reinforcement learning. A simulation experiment confirms that the proposed algorithm performs significantly more intelligently than a pure reinforcement learning algorithm. By addressing the shortcomings of the agent's neural network and the sparse rewards of large-map combat games, the algorithm effectively reduces the difficulty of convergence. To the best of our knowledge, this is the first algorithm design for intelligent wargaming that combines multi-attribute decision making with reinforcement learning, an interdisciplinary cross-innovation spanning intelligent wargame design and reinforcement learning algorithm improvement.
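The weighting-and-threat pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the attribute names, the intuitionistic fuzzy ratings, the score function s = μ − ν, and the shaping coefficient are all illustrative assumptions.

```python
import numpy as np

# Hypothetical intuitionistic fuzzy numbers (mu, nu) rating the importance of
# each threat attribute; mu = membership degree, nu = non-membership degree,
# with mu + nu <= 1. Attribute names are illustrative only.
attribute_ifns = {
    "distance":  (0.7, 0.2),
    "firepower": (0.8, 0.1),
    "armor":     (0.5, 0.3),
}

def ifn_score(mu, nu):
    """Score function of an intuitionistic fuzzy number: s = mu - nu."""
    return mu - nu

def attribute_weights(ifns):
    """Normalize the IFN scores into attribute weights summing to 1."""
    scores = np.array([ifn_score(mu, nu) for mu, nu in ifns.values()])
    return scores / scores.sum()

def threat_value(piece_attrs, weights):
    """Threat of one enemy piece: weighted sum of its normalized attributes."""
    return float(np.dot(piece_attrs, weights))

def shaped_reward(base_reward, enemy_pieces, weights, alpha=0.1):
    """Red-side reward: environment reward minus a threat penalty."""
    total_threat = sum(threat_value(p, weights) for p in enemy_pieces)
    return base_reward - alpha * total_threat

weights = attribute_weights(attribute_ifns)
# Two hypothetical blue pieces, attributes normalized to [0, 1].
enemy_pieces = [np.array([0.9, 0.6, 0.4]), np.array([0.2, 0.8, 0.7])]
r = shaped_reward(1.0, enemy_pieces, weights, alpha=0.1)
```

The shaping term here simply penalizes total opponent threat; in practice the threat assessment could enter the reward in other forms (e.g. rewarding the destruction of high-threat pieces).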
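To show how such a reward function plugs into the AC framework, the following is a minimal single-state actor-critic sketch on a toy two-action problem. The environment, learning rates, and iteration count are illustrative assumptions standing in for the wargame simulation; the update rule itself is the standard advantage actor-critic step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the wargame environment: action 1 yields a higher reward.
# In the full system, env_step would return the threat-shaped red-side reward.
def env_step(action):
    return 1.0 if action == 1 else 0.2

theta = np.zeros(2)   # actor: softmax preferences over the two actions
v = 0.0               # critic: value estimate of the single state
alpha_actor, alpha_critic = 0.1, 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(2000):
    pi = softmax(theta)
    a = rng.choice(2, p=pi)
    r = env_step(a)
    td_error = r - v                 # advantage estimate (single state, no bootstrap)
    v += alpha_critic * td_error     # critic update toward the observed return
    grad = -pi
    grad[a] += 1.0                   # grad of log pi(a) for a softmax policy
    theta += alpha_actor * td_error * grad  # actor update along the advantage

pi = softmax(theta)  # learned policy concentrates on the higher-reward action
```

The critic's TD error plays the role of the advantage signal; with the threat-shaped reward, the actor is steered toward actions that reduce exposure to high-threat blue pieces.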