Financial markets, which involve more than $90 trillion in market capitalization, attract countless investors around the world. Recently, reinforcement learning in financial markets (FinRL) has emerged as a promising direction for training agents to make profitable investment decisions. However, most FinRL methods are evaluated only on profit-related measures and ignore many other critical axes, which falls far short of what financial practitioners need before deploying these methods in real-world financial markets. Therefore, we introduce PRUDEX-Compass, which has 6 axes, i.e., Profitability, Risk-control, Universality, Diversity, rEliability, and eXplainability, with a total of 17 measures for systematic evaluation. Specifically, i) we propose AlphaMix+ as a strong FinRL baseline, which leverages mixture-of-experts (MoE) and risk-sensitive approaches to make diversified, risk-aware investment decisions; ii) we evaluate 8 FinRL methods on 4 long-term real-world datasets from influential financial markets to demonstrate the usage of PRUDEX-Compass; and iii) we release PRUDEX-Compass, together with the 4 real-world datasets, standard implementations of the 8 FinRL methods, and a portfolio management environment, as public resources to facilitate the design and comparison of new FinRL methods. We hope that PRUDEX-Compass can not only shed light on future FinRL research, so that untrustworthy results do not hold FinRL back from successful industry deployment, but also provide a challenging new algorithm evaluation scenario for the reinforcement learning (RL) community.