Quantitative markets change rapidly and carry many sources of uncertainty, so choosing actions that make stock trading profitable remains a challenging problem. Reinforcement learning (RL), a reward-oriented approach to optimal control, has emerged as a promising method for strategic decision-making in such complex financial settings. In this paper, we integrate two established financial trading strategies, constant proportion portfolio insurance (CPPI) and time-invariant portfolio protection (TIPP), into multi-agent deep deterministic policy gradient (MADDPG), and propose two purpose-built multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, for investigating strategic trading in quantitative markets. We then test the proposed approaches on 100 different stocks from the real financial market. The experimental results show that CPPI-MADDPG and TIPP-MADDPG generally outperform the conventional approaches.
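For readers unfamiliar with the two strategies named in the abstract, the following is a minimal sketch of the classical CPPI and TIPP allocation rules as they are commonly stated in the portfolio-insurance literature. It does not reproduce how the paper couples these rules with MADDPG, which the abstract does not describe. The function names, parameter names, and sample numbers are hypothetical and chosen only for illustration.

```python
def cppi_exposure(value: float, floor: float, multiplier: float) -> float:
    """Amount placed in the risky asset under the textbook CPPI rule.

    The cushion is the portfolio value above the protected floor; the risky
    exposure is a constant multiple of that cushion. This sketch assumes no
    leverage and no short selling, so exposure is clamped to [0, value].
    """
    cushion = max(value - floor, 0.0)
    return min(multiplier * cushion, value)


def tipp_floor(value: float, prev_floor: float, protect_ratio: float) -> float:
    """Floor update under the textbook TIPP rule.

    The floor ratchets upward to a fixed fraction of the current portfolio
    value whenever that exceeds the previous floor, and never decreases.
    """
    return max(prev_floor, protect_ratio * value)


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from the paper).
    value, floor, multiplier, protect_ratio = 100.0, 80.0, 3.0, 0.8

    print("CPPI risky exposure:", cppi_exposure(value, floor, multiplier))

    # After a gain, TIPP lifts the floor before the exposure is recomputed.
    value = 120.0
    floor = tipp_floor(value, floor, protect_ratio)
    print("TIPP floor after gain:", floor)
    print("TIPP risky exposure:", cppi_exposure(value, floor, multiplier))
```

The only difference between the two rules in this sketch is the floor: CPPI keeps it fixed (or growing at a preset rate), while TIPP raises it as the portfolio gains value, locking in part of the profit.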