This paper proposes a novel portfolio optimization model based on an improved deep reinforcement learning algorithm. The objective function of the model is a weighted sum of the expectation and the value at risk (VaR) of the portfolio's cumulative return. The proposed algorithm uses an actor-critic architecture: the critic network learns the distribution of the portfolio's cumulative return via quantile regression, and the actor network outputs the optimal portfolio weights by maximizing this objective function. In addition, we use a linear transformation function to allow short selling of assets. Finally, we use a multi-process method, Ape-X, to accelerate deep reinforcement learning training. To validate the approach, we backtest two representative portfolios and observe that the proposed model outperforms the benchmark strategies.
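The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the midpoint quantile levels, the convention that VaR is the alpha-quantile of the return (so larger is better), the `risk_weight` parameter, and the particular affine map used for short selling are all assumptions made here for illustration.

```python
import numpy as np


def quantile_midpoints(n):
    """Quantile levels tau_i = (2i + 1) / (2n), i = 0..n-1 (assumed convention)."""
    return (2 * np.arange(n) + 1) / (2.0 * n)


def pinball_loss(pred_quantiles, target, taus):
    """Quantile-regression (pinball) loss of predicted quantiles against one
    realized cumulative return. A critic would minimize this over samples."""
    diff = target - np.asarray(pred_quantiles, dtype=float)
    return float(np.mean(np.maximum(taus * diff, (taus - 1.0) * diff)))


def objective(pred_quantiles, alpha=0.05, risk_weight=0.5):
    """Weighted sum of the expected return and VaR, both read off the
    predicted quantiles. VaR is taken as the alpha-quantile of the return
    (an assumed sign convention), so a higher value means less tail risk."""
    q = np.sort(np.asarray(pred_quantiles, dtype=float))
    taus = quantile_midpoints(len(q))
    expected = q.mean()                    # mean of quantiles approximates E[R]
    var_alpha = np.interp(alpha, taus, q)  # interpolated alpha-quantile
    return float((1.0 - risk_weight) * expected + risk_weight * var_alpha)


def long_short_weights(p, max_short=0.5):
    """Hypothetical affine map from long-only weights p (non-negative,
    summing to 1) to weights that may be negative but still sum to 1:
    w_i = (1 + n * max_short) * p_i - max_short, so every w_i >= -max_short."""
    p = np.asarray(p, dtype=float)
    if np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("p must be non-negative and sum to 1")
    n = p.size
    return (1.0 + n * max_short) * p - max_short


if __name__ == "__main__":
    quantiles = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
    taus = quantile_midpoints(len(quantiles))
    print("pinball loss:", pinball_loss(quantiles, 0.0, taus))
    print("objective:", objective(quantiles, alpha=0.1, risk_weight=0.5))
    print("weights:", long_short_weights([1.0, 0.0, 0.0]))
```

In this sketch an actor would choose the weights whose predicted quantiles give the largest `objective` value; setting `risk_weight` to 0 recovers a pure expected-return criterion, and setting it to 1 optimizes the tail quantile alone.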