We present a new financial framework in which two families of reinforcement learning (RL) agents, representing Liquidity Providers and Liquidity Takers, learn simultaneously to satisfy their respective objectives. Thanks to a parametrized reward formulation and the use of deep RL, each group learns a shared policy that generalizes and interpolates over a wide range of behaviors. This is a step towards a fully RL-based market simulator that replicates complex market conditions and is particularly suited to studying the dynamics of financial markets under various scenarios.
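To make the idea of a parametrized reward with a shared policy concrete, the following is a minimal sketch under stated assumptions, not the formulation used in this work. The reward terms (`pnl`, `market_share`), the single trade-off weight `pnl_weight`, and all function names are hypothetical and chosen only for illustration. The point it shows is that one policy can serve a whole family of agents when each agent's reward parameters are appended to its observation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardParams:
    # Hypothetical trade-off weight in [0, 1]; an assumption, not from the source.
    pnl_weight: float


def parametrized_reward(pnl: float, market_share: float, params: RewardParams) -> float:
    """Blend two illustrative objectives using the agent's own weight."""
    w = params.pnl_weight
    return w * pnl + (1.0 - w) * market_share


def policy_input(observation: list[float], params: RewardParams) -> list[float]:
    """Append the reward parameters to the observation.

    Feeding the parameters to a single shared policy lets it condition its
    behavior on them, so one network can cover many agent types.
    """
    return list(observation) + [params.pnl_weight]


def sample_agent_params(rng: random.Random) -> RewardParams:
    """Draw a fresh weight per agent so training spans a range of behaviors."""
    return RewardParams(pnl_weight=rng.random())


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_agent_params(rng)
    obs = [0.1, 0.2]  # placeholder market features
    print(policy_input(obs, params))
    print(parametrized_reward(pnl=2.0, market_share=0.5, params=params))
```

In this sketch, sampling `pnl_weight` per agent during training is what would allow the shared policy to interpolate between behaviors at evaluation time; how the parameters are actually defined and sampled is left open here.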