Traditional risk factors like beta, size/value, and momentum often lag behind market dynamics in measuring and predicting stock return volatility. Statistical models like PCA and factor analysis fail to capture hidden nonlinear relationships. Genetic programming (GP) can identify nonlinear factors but often lacks mechanisms for evaluating factor quality, and the resulting formulas are complex. To address these challenges, we propose a Hierarchical Proximal Policy Optimization (HPPO) framework for automated factor generation and evaluation. HPPO uses two PPO models: a high-level policy assigns weights to stock features, and a low-level policy identifies latent nonlinear relationships. The Pearson correlation between generated factors and return volatility serves as the reward signal. Transfer learning pre-trains the high-level policy on large-scale historical data, fine-tuning it with the latest data to adapt to new features and shifts. Experiments show the HPPO-TO algorithm achieves a 25\% excess return in HFT markets across China (CSI 300/800), India (Nifty 100), and the US (S\&P 500). Code and data are available at https://github.com/wencyxu/HRL-HF_risk_factor_set.
翻译:暂无翻译