Reinforcement learning (RL) techniques have shown great success in many challenging quantitative trading tasks, such as portfolio management and algorithmic trading. Intraday trading in particular is one of the most profitable and risky of these tasks, because the intraday behavior of the financial market reflects billions in rapidly fluctuating capital. However, the vast majority of existing RL methods focus on relatively low-frequency trading scenarios (e.g., day-level) and fail to capture fleeting intraday investment opportunities because of two major challenges: 1) how to effectively train profitable RL agents for intraday investment decision-making, which involves a high-dimensional, fine-grained action space; 2) how to learn a meaningful multi-modal market representation that captures the intraday behavior of the financial market at the tick level. Motivated by the efficient workflow of professional human intraday traders, we propose DeepScalper, a deep reinforcement learning framework for intraday trading that tackles these challenges. Specifically, DeepScalper includes four components: 1) a dueling Q-network with action branching to handle the large action space of intraday trading for efficient RL optimization; 2) a novel reward function with a hindsight bonus to encourage RL agents to make trading decisions with a long-term horizon covering the entire trading day; 3) an encoder-decoder architecture to learn a multi-modal temporal market embedding, which incorporates both macro-level and micro-level market information; 4) a risk-aware auxiliary task to strike a balance between maximizing profit and minimizing risk. Through extensive experiments on real-world market data spanning over three years on six financial futures, we demonstrate that DeepScalper significantly outperforms many state-of-the-art baselines in terms of four financial criteria.
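To make the first component more concrete, the following is a minimal sketch of how a dueling Q-network with action branching typically combines its outputs, using the standard branching-dueling aggregation Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a'). The branch names (`price`, `quantity`), the helper functions, and the numeric values are illustrative assumptions, not details taken from the abstract or from DeepScalper's actual implementation.

```python
from statistics import fmean


def branch_q_values(state_value, advantages):
    """Combine a shared state value V(s) with one branch's advantages.

    Q_d(s, a) = V(s) + A_d(s, a) - mean_a' A_d(s, a')
    """
    mean_adv = fmean(advantages)
    return [state_value + a - mean_adv for a in advantages]


def greedy_joint_action(state_value, branch_advantages):
    """Pick the argmax independently in every action branch."""
    action = {}
    for name, adv in branch_advantages.items():
        q = branch_q_values(state_value, adv)
        action[name] = max(range(len(q)), key=q.__getitem__)
    return action


# Hypothetical network outputs for a single market state.
state_value = 0.5
branch_advantages = {
    "price": [0.1, 0.4, -0.2],  # assumed: 3 candidate price levels
    "quantity": [0.0, 0.3],     # assumed: 2 candidate order sizes
}

action = greedy_joint_action(state_value, branch_advantages)
print(action)
```

With branching, the network in this toy setting emits 3 + 2 = 5 advantage values rather than one Q-value for each of the 3 × 2 = 6 joint actions. The output size grows with the sum of the branch sizes instead of their product, which is why this design suits a fine-grained action space.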