Financial Portfolio Management is one of the most applicable problems in Reinforcement Learning (RL) by its sequential decision-making nature. Existing RL-based approaches, while inspiring, often lack scalability, reusability, or profundity of intake information to accommodate the ever-changing capital markets. In this paper, we design and develop MSPM, a novel Multi-agent Reinforcement learning-based system with a modularized and scalable architecture for portfolio management. MSPM involves two asynchronously updated units: Evolving Agent Module (EAM) and Strategic Agent Module (SAM). A self-sustained EAM produces signal-comprised information for a specific asset using heterogeneous data inputs, and each EAM possesses its reusability to have connections to multiple SAMs. A SAM is responsible for the assets reallocation of a portfolio using profound information from the EAMs connected. With the elaborate architecture and the multi-step condensation of the volatile market information, MSPM aims to provide a customizable, stable, and dedicated solution to portfolio management that existing approaches do not. We also tackle data-shortage issue of newly-listed stocks by transfer learning, and validate the necessity of EAM. Experiments on 8-year U.S. stock markets data prove the effectiveness of MSPM in profits accumulation by its outperformance over existing benchmarks.
翻译:金融组合管理是强化学习(RL)中最适用的问题之一,因为其依次决策性质,是强化学习(RL)的最适用问题之一; 现有基于RL的做法,虽然激励性(往往缺乏可缩放性、可再使用性或接收信息的先进性),以适应不断变化的资本市场; 在本文件中,我们设计和开发MSPM, 这是一种新型的多剂强化学习(MSPM)学习(MSPM)系统,它具有模块化和可扩缩的组合管理架构; MSPM涉及两个同步更新的单位: 动态代理模块(EAM)和战略代理模块(SAM); 自我维持的EAM(EAM)为使用不同数据投入的具体资产生成信号化信息,而每个EAM(EAM)拥有与多个不断变化的资本市场连接的可重复性。 一个SAM(MSAM)负责利用与组合管理(EAM)的深度信息进行资产再分配。 随着复杂的结构和波动市场信息的多步调,MSPM(多步调),旨在为组合管理提供一个可定制、稳定、专门的解决办法,而现有办法并不超越现有办法。