In high-dimensional time-series analysis, it is essential to identify a set of key factors (namely, the style factors) that explain changes in the observed variable. For example, volatility modeling in finance relies on a set of risk factors, and climate change studies in climatology rely on a set of causal factors. Ideal low-dimensional style factors should balance significance (high explanatory power) and stability (consistency, without significant fluctuations). However, existing supervised and unsupervised feature extraction methods struggle to address this tradeoff. In this paper, we propose Style Miner, a reinforcement learning method for generating style factors. We first formulate the problem as a Constrained Markov Decision Process with explanatory power as the return and stability as the constraint. We then design fine-grained immediate rewards and costs and use a Lagrangian heuristic to balance them adaptively. Experiments on real-world financial data sets show that Style Miner outperforms existing learning-based methods by a large margin and achieves a 10% relative gain in R-squared explanatory power over the industry-renowned factors proposed by human experts.
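The adaptive balance between rewards and costs can be illustrated with a generic Lagrangian relaxation of a constrained objective. The sketch below is not the Style Miner algorithm itself; it assumes the textbook dual-ascent form, in which a multiplier penalizes the cost inside the reward and is raised whenever the average cost exceeds a limit. The function names, the cost limit, the learning rate, and the toy cost values are all hypothetical choices made for illustration.

```python
def lagrangian_reward(reward: float, cost: float, lam: float) -> float:
    """Fold the cost into the reward using the multiplier lam (assumed form)."""
    return reward - lam * cost


def update_multiplier(lam: float, avg_cost: float, cost_limit: float,
                      lr: float = 0.05) -> float:
    """One dual-ascent step.

    lam grows when the average cost exceeds the limit (constraint violated),
    shrinks when the cost is below the limit, and is clipped at zero.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


# Toy run with made-up average costs that start above the limit and decay.
lam = 0.0
cost_limit = 0.2  # hypothetical stability budget
history = []
for avg_cost in [0.5, 0.4, 0.3, 0.2, 0.1]:
    lam = update_multiplier(lam, avg_cost, cost_limit)
    history.append(lam)
```

In this toy run the multiplier rises while the cost stays above the limit, holds steady when the cost equals the limit, and falls once the cost drops below it, so the penalty on instability adapts instead of being fixed in advance.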