深入库存管理 (Deep Inventory Management) - 专知论文

会员服务 ·

0

Learning · 控制器 · 有向 · 周期的 · 强化学习 ·

2022 年 11 月 25 日

Deep Inventory Management

翻译：深入库存管理

Dhruv Madeka,Kari Torkkola,Carson Eisenach,Anna Luo,Dean P. Foster,Sham M. Kakade

This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, our results show that several policy learning approaches are competitive with or outperform classical methods. In order to train these algorithms, we develop novel techniques to convert historical data into a simulator. On the theoretical side, we present learnability results on a subclass of inventory control problems, where we provide a provable reduction of the reinforcement learning problem to that of supervised learning. On the algorithmic side, we present a model-based reinforcement learning procedure (Direct Backprop) to solve the periodic review inventory control problem by constructing a differentiable simulator. Under a variety of metrics Direct Backprop outperforms model-free RL and newsvendor baselines, in both simulations and real-world deployments.

翻译：这项工作提供了一种深强化学习方法,以解决定期审查库存控制系统的方法,该系统的供应商周转时间、损失的销售、相关需求和价格匹配都具有随机性。虽然这一动态程序历来被认为是棘手的,但我们的结果表明,一些政策学习方法与传统方法具有竞争力或优于典型方法。为了培训这些算法,我们开发了将历史数据转换为模拟器的新技术。在理论方面,我们展示了库存控制问题亚类的可学习性结果,从中可以看出强化学习问题与受监督的学习问题之间的可证实的减少。在算法方面,我们展示了一种基于模型的强化学习程序(隐形背面程序),通过建立一个不同的模拟器来解决定期审查库存控制问题。在各种标准直接反向偏差模型的模型RL和新闻供应商基线下,在模拟和现实世界部署中都是如此。

0

相关内容

Learning

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

基于重要性采样的并行离策略强化学习方法研究

国家自然科学基金

23+阅读 · 2015年12月31日

Chemerin通过调节p38MAPK通路参与动脉粥样硬化分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于深度学习的飞行器故障不确定性评估与预测研究

国家自然科学基金

4+阅读 · 2014年12月31日

基于数据粒度模型的3DGIS实时可视化数据组织管理与绘制优化方法

国家自然科学基金

1+阅读 · 2012年12月31日

基于Mercer核的非负矩阵分解关键问题研究及其在人脸识别中的应用

国家自然科学基金

0+阅读 · 2012年12月31日

基于Multi-Agent的应急状态下协同供应链数据集成研究

国家自然科学基金

0+阅读 · 2012年12月31日

光相干层析成像研究血液凝固过程中的光学性质动态变化及特征参数

国家自然科学基金

0+阅读 · 2011年12月31日

非线性软测量系统递推量子随机滤波方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于本体的Deep Web搜索技术

国家自然科学基金

2+阅读 · 2009年12月31日

Smoothed Online Learning for Prediction in Piecewise Affine Systems

Arxiv

0+阅读 · 2023年1月26日

Deep Laplacian-based Options for Temporally-Extended Exploration

Arxiv

0+阅读 · 2023年1月26日

Double Deep Reinforcement Learning Techniques for Low Dimensional Sensing Mapless Navigation of Terrestrial Mobile Robots

Arxiv

0+阅读 · 2023年1月26日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Forecasting: theory and practice

Arxiv

57+阅读 · 2022年1月5日

Collective Intelligence for Deep Learning: A Survey of Recent Developments

Arxiv

22+阅读 · 2021年12月22日

The Principles of Deep Learning Theory

Arxiv

65+阅读 · 2021年6月18日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

79+阅读 · 2020年1月19日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】面向企业的图学习扩展：生产级图学习与推理，485页pdf

AI智能体编程：技术、挑战与机遇综述

【国家标准】数据安全技术数据安全风险评估方法

【CMU博士论文】交互式学习的进展：替代性反馈机制与自适应因果推理

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

相关论文

Smoothed Online Learning for Prediction in Piecewise Affine Systems

Arxiv

0+阅读 · 2023年1月26日

Deep Laplacian-based Options for Temporally-Extended Exploration

Arxiv

0+阅读 · 2023年1月26日

Double Deep Reinforcement Learning Techniques for Low Dimensional Sensing Mapless Navigation of Terrestrial Mobile Robots

Arxiv

0+阅读 · 2023年1月26日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Forecasting: theory and practice

Arxiv

57+阅读 · 2022年1月5日

Collective Intelligence for Deep Learning: A Survey of Recent Developments

Arxiv

22+阅读 · 2021年12月22日

The Principles of Deep Learning Theory

Arxiv

65+阅读 · 2021年6月18日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Recent advances in deep learning theory

Recent advances in deep learning theory

Arxiv

50+阅读 · 2020年12月20日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

79+阅读 · 2020年1月19日

相关基金

基于重要性采样的并行离策略强化学习方法研究

国家自然科学基金

23+阅读 · 2015年12月31日

Chemerin通过调节p38MAPK通路参与动脉粥样硬化分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于深度学习的飞行器故障不确定性评估与预测研究

国家自然科学基金

4+阅读 · 2014年12月31日

基于数据粒度模型的3DGIS实时可视化数据组织管理与绘制优化方法

国家自然科学基金

1+阅读 · 2012年12月31日

基于Mercer核的非负矩阵分解关键问题研究及其在人脸识别中的应用

国家自然科学基金

0+阅读 · 2012年12月31日

基于Multi-Agent的应急状态下协同供应链数据集成研究

国家自然科学基金

0+阅读 · 2012年12月31日

光相干层析成像研究血液凝固过程中的光学性质动态变化及特征参数

国家自然科学基金

0+阅读 · 2011年12月31日

非线性软测量系统递推量子随机滤波方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于本体的Deep Web搜索技术

国家自然科学基金

2+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员