Financial portfolio management is one of the most natural applications of reinforcement learning (RL), owing to its sequential decision-making nature. Existing RL-based approaches, while inspiring, often lack the scalability, reusability, or depth of input information needed to accommodate ever-changing capital markets. In this paper, we propose MSPM, a modularized and scalable multi-agent RL-based system for financial portfolio management. MSPM involves two asynchronously updated units: an Evolving Agent Module (EAM) and a Strategic Agent Module (SAM). A self-sustained EAM produces signal-bearing information for a specific asset from heterogeneous data inputs, and because an EAM is reusable, it can be connected to multiple SAMs. An SAM is responsible for asset reallocation within a portfolio, using the condensed information supplied by its connected EAMs. Through this layered architecture and the multi-step condensation of volatile market information, MSPM aims to provide a customizable, stable, and dedicated solution to portfolio management, unlike existing approaches. We also tackle the data-shortage issue of newly listed stocks with transfer learning, and validate the necessity of the EAM on four different portfolios. Experiments on eight years of U.S. stock market data demonstrate the effectiveness of MSPM in profit accumulation: it outperforms existing benchmarks.
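To make the modular wiring concrete, the following is a minimal sketch of how one EAM might be shared by several SAMs. It illustrates the connection pattern only, not the system's actual implementation: the class and function names, the toy `momentum_policy` (a placeholder for a trained RL policy), the softmax weighting, the `CASH` position, and the tickers `AAA`, `BBB`, and `CCC` are all assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical signal encoding: -1 (sell), 0 (hold), +1 (buy).
Signal = int


def momentum_policy(prices: Sequence[float]) -> Signal:
    """Toy stand-in for a trained EAM policy: the sign of the price change."""
    if len(prices) < 2:
        return 0
    change = prices[-1] - prices[0]
    return int(change > 0) - int(change < 0)


@dataclass
class EvolvingAgentModule:
    """Assumed EAM interface: one asset in, one signal out."""
    asset: str
    policy: Callable[[Sequence[float]], Signal]

    def signal(self, prices: Sequence[float]) -> Signal:
        return self.policy(prices)


@dataclass
class StrategicAgentModule:
    """Assumed SAM interface: turns signals from connected EAMs into weights."""
    eams: List[EvolvingAgentModule]

    def reallocate(self, market: Dict[str, Sequence[float]]) -> Dict[str, float]:
        # Collect one signal per connected EAM; cash gets a neutral score of 0.
        scores = {"CASH": 0.0}
        for eam in self.eams:
            scores[eam.asset] = float(eam.signal(market[eam.asset]))
        # Softmax keeps every weight positive and makes the weights sum to 1.
        total = sum(math.exp(s) for s in scores.values())
        return {asset: math.exp(s) / total for asset, s in scores.items()}


# Hypothetical price histories for three made-up tickers.
market = {"AAA": [1.0, 2.0], "BBB": [2.0, 1.0], "CCC": [1.0, 1.0]}

eam_a = EvolvingAgentModule("AAA", momentum_policy)
eam_b = EvolvingAgentModule("BBB", momentum_policy)
eam_c = EvolvingAgentModule("CCC", momentum_policy)

# The same EAM instance (eam_a) is reused by two different portfolios.
sam_one = StrategicAgentModule([eam_a, eam_b])
sam_two = StrategicAgentModule([eam_a, eam_c])

weights_one = sam_one.reallocate(market)
weights_two = sam_two.reallocate(market)
```

In this sketch the reuse is by object sharing: both portfolios read signals from the same `eam_a`, so updating that one module would affect every SAM connected to it, while each SAM remains free to weight the signals differently.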