The objectives of option hedging and trading extend beyond mere protection against downside risk; the desire to seek gains also drives agents' strategies. In this study, we showcase the potential of robust risk-aware reinforcement learning (RL) in mitigating the risks associated with path-dependent financial derivatives. We accomplish this by leveraging the policy gradient approach of Jaimungal, Pesenti, Wang, and Tatsat (2022), which optimises robust risk-aware performance criteria. We apply this methodology specifically to the hedging of barrier options, and highlight how the optimal hedging strategy becomes distorted as the agent moves from being risk-averse to risk-seeking, as well as how the agent robustifies their strategy. We further investigate the performance of the hedge when the data-generating process (DGP) differs from the training DGP, and demonstrate that the robust strategies outperform the non-robust ones.