Traditional ETF stock selection methods and reinforcement learning models such as the Asynchronous Advantage Actor-Critic (A3C) often suffer from high-dimensional feature spaces and overfitting when applied to complex financial markets. Moreover, static clustering algorithms fail to capture evolving market regimes, because the cluster with higher returns in one period may not remain optimal in the next. To address these limitations, this paper proposes Q-A3C2, a quantum-enhanced A3C framework that integrates time-series dynamic clustering. By embedding Variational Quantum Circuits (VQCs) into the policy network, Q-A3C2 enhances nonlinear feature representation and enables adaptive decision-making at the cluster level. Experimental results on the S&P 500 constituents show that Q-A3C2 achieves a cumulative return of 17.09%, compared with 7.09% for the benchmark, demonstrating superior adaptability and exploration in dynamic financial environments.
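The observation that the best-performing cluster in one period may not stay best in the next can be illustrated with a small sketch. This is not the Q-A3C2 pipeline described in the paper: it is a hypothetical, pure-Python illustration that assumes plain k-means on per-window (mean return, volatility) features, re-clusters the assets on each window, and re-selects the cluster with the highest mean return. The function names, the feature choice, the window length, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def kmeans(points, k, iters=20):
    """Plain k-means on equal-length tuples; returns one label per point.

    Assumption: centers are initialised deterministically from evenly spaced
    positions in the sorted distinct points (requires at least k of them).
    """
    ordered = sorted(set(points))
    step = (len(ordered) - 1) / max(k - 1, 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def best_cluster_per_window(returns, window, k):
    """Re-cluster on each non-overlapping window and pick the top cluster.

    returns: dict mapping ticker -> list of periodic returns (equal lengths).
    Yields, per window, the sorted tickers of the highest-mean-return cluster.
    """
    tickers = sorted(returns)
    n = len(returns[tickers[0]])
    picks = []
    for start in range(0, n - window + 1, window):
        # Hypothetical features: (mean return, volatility) inside the window.
        feats = []
        for t in tickers:
            chunk = returns[t][start:start + window]
            feats.append((mean(chunk), pstdev(chunk)))
        labels = kmeans(feats, k)
        # Group tickers by cluster label, remembering each mean return.
        clusters = {}
        for t, f, lab in zip(tickers, feats, labels):
            clusters.setdefault(lab, []).append((t, f[0]))
        best = max(clusters.values(), key=lambda m: mean([r for _, r in m]))
        picks.append(sorted(t for t, _ in best))
    return picks


# Toy data (invented): A and B lead in the first window, C and D in the second.
toy_returns = {
    "A": [0.02, 0.03, 0.01, -0.01, -0.02, -0.01],
    "B": [0.03, 0.02, 0.02, -0.02, -0.01, -0.02],
    "C": [-0.01, -0.02, -0.01, 0.02, 0.03, 0.01],
    "D": [-0.02, -0.01, -0.02, 0.03, 0.02, 0.02],
}
print(best_cluster_per_window(toy_returns, window=3, k=2))
```

On this toy data the selected cluster changes between the two windows, which is the behaviour a static, fit-once clustering would miss. In the paper's framework the per-cluster decision is made by the quantum-enhanced A3C policy, not by the simple highest-mean rule assumed here.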