Solving portfolio management problems using deep reinforcement learning has been getting much attention in finance for a few years. We have proposed a new method using experts signals and historical price data to feed into our reinforcement learning framework. Although experts signals have been used in previous works in the field of finance, as far as we know, it is the first time this method, in tandem with deep RL, is used to solve the financial portfolio management problem. Our proposed framework consists of a convolutional network for aggregating signals, another convolutional network for historical price data, and a vanilla network. We used the Proximal Policy Optimization algorithm as the agent to process the reward and take action in the environment. The results suggested that, on average, our framework could gain 90 percent of the profit earned by the best expert.
翻译:几年来,利用深层强化学习解决投资组合管理问题在金融领域一直受到极大关注。我们提出了使用专家信号和历史价格数据作为我们强化学习框架的新方法。尽管专家信号在金融领域以往工作中已经使用过,但据我们所知,这是第一次在深层RL的配合下,利用这一方法解决金融投资组合管理问题。我们提议的框架包括一个汇集信号的革命网络、另一个历史价格数据革命网络和香草网络。我们用“最优化政策算法”作为处理奖赏和在环境中采取行动的代理。结果显示,平均而言,我们的框架可以获得最佳专家所赚取利润的90%。