The objectives of option hedging/trading extend beyond mere protection against downside risk: the desire to seek gains also drives agents' strategies. In this study, we showcase the potential of robust risk-aware reinforcement learning (RL) in mitigating the risks associated with path-dependent financial derivatives. We accomplish this by leveraging a policy gradient approach that optimises robust risk-aware performance criteria. We specifically apply this methodology to the hedging of barrier options, and highlight how the optimal hedging strategy undergoes distortions as the agent moves from being risk-averse to risk-seeking, as well as how the agent robustifies their strategy. We further investigate the performance of the hedge when the data-generating process (DGP) varies from the training DGP, and demonstrate that the robust strategies outperform the non-robust ones.