Optimal execution is a sequential decision-making problem for cost saving in algorithmic trading. Prior studies have shown that reinforcement learning (RL) can help decide order-splitting sizes. However, one problem remains open: how to place limit orders at appropriate prices? The key challenge lies in the "continuous-discrete duality" of the action space. On the one hand, a continuous action space based on percentage price changes is preferred for generalization. On the other hand, the trader must ultimately choose limit prices from a discrete set because of the tick size, which requires specialization for each stock with its own characteristics (e.g., liquidity and price range). We therefore need continuous control for generalization and discrete control for specialization. To this end, we propose a hybrid RL method that combines the advantages of both. We first use a continuous-control agent to scope out an action subset, and then deploy a fine-grained agent to choose a specific limit price within it. Extensive experiments show that our method has higher sample efficiency and better training stability than existing RL algorithms, and that it significantly outperforms previous learning-based methods for order execution.
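To make the two-stage action selection concrete, the following is a minimal Python sketch, not the paper's implementation: a continuous agent's percentage offset is mapped onto a small window of tick-aligned candidate prices (the "scoping" step), and a discrete agent then picks one candidate, here by a greedy argmax over toy Q-values. The tick size, window width, and the helper names scope_action_subset and choose_limit_price are illustrative assumptions.

```python
import numpy as np

TICK_SIZE = 0.01  # hypothetical tick size, for illustration only


def scope_action_subset(mid_price, pct_offset, window_ticks=3):
    """Coarse step: map the continuous agent's percentage offset to a
    small window of valid limit prices on the tick grid."""
    raw_price = mid_price * (1.0 + pct_offset)
    center_tick = round(raw_price / TICK_SIZE)
    candidate_ticks = np.arange(center_tick - window_ticks,
                                center_tick + window_ticks + 1)
    return candidate_ticks * TICK_SIZE


def choose_limit_price(candidates, q_values):
    """Fine-grained step: pick one candidate limit price, e.g. greedily
    from the discrete agent's value estimates for the candidates."""
    return candidates[int(np.argmax(q_values))]


if __name__ == "__main__":
    mid = 100.00
    pct_offset = 0.0005  # continuous agent proposes +0.05% vs. mid (toy value)
    candidates = scope_action_subset(mid, pct_offset)
    # Toy Q-values standing in for the fine-grained agent's output.
    q = np.random.default_rng(0).normal(size=candidates.shape)
    price = choose_limit_price(candidates, q)
    print("candidate prices:", np.round(candidates, 2))
    print(f"chosen limit price: {price:.2f}")
```

The split mirrors the abstract's motivation: the continuous proposal generalizes across stocks with different price levels, while the discrete choice specializes to the actual tick grid of the traded instrument.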