The 3D Bin Packing Problem (3D-BPP) is one of the most sought-after yet challenging problems in industry: an agent must pack variable-sized items, delivered in sequence, into a finite bin with the aim of maximizing space utilization. It is a strongly NP-hard optimization problem, and no solution offered to date achieves high space utilization. In this paper, we present a new reinforcement learning (RL) framework for the 3D-BPP that improves packing performance. First, we introduce a buffer that allows multi-item action selection; by increasing the degrees of freedom in action selection, a more complex policy can be derived that yields better packing performance. Second, we propose an agnostic data augmentation strategy that exploits both bin and item symmetries to improve sample efficiency. Third, we implement a model-based RL method adapted from AlphaGo, an algorithm that has shown superhuman performance in zero-sum games; our adaptation works in single-player, score-based environments. Although AlphaGo variants are known to be computationally heavy, we manage to train the proposed framework with a single thread and a single GPU, while obtaining a solution that outperforms the state-of-the-art results in space utilization.
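To make the second contribution concrete, the sketch below illustrates how bin symmetries can multiply training data. It assumes the bin state is encoded as a 2D heightmap and an action places an item's rectangular footprint at a grid cell; the function name and encoding are hypothetical, not the paper's actual interface, and the item symmetries mentioned in the abstract (e.g., axis-aligned item rotations) would be handled analogously.

```python
# A minimal sketch of symmetry-based data augmentation for 3D-BPP,
# assuming (hypothetically) a heightmap state and corner-cell placements.
import numpy as np

def reflect_transitions(heightmap, xy, footprint):
    """Yield the (state, action) pair under each reflection of the bin.

    Mirroring the bin about either horizontal axis leaves the packing
    problem unchanged, so every observed transition can be replayed
    under all four reflections, quadrupling the effective sample count.
    """
    H, W = heightmap.shape
    (x, y), (fx, fy) = xy, footprint
    for flip_x in (False, True):
        for flip_y in (False, True):
            hm, ax, ay = heightmap, x, y
            if flip_x:                  # mirror along the x-axis
                hm = hm[::-1, :]
                ax = H - fx - x         # corner of the mirrored footprint
            if flip_y:                  # mirror along the y-axis
                hm = hm[:, ::-1]
                ay = W - fy - y
            yield hm.copy(), (ax, ay)

# Example: one placement of a 2x1 item in a 4x4 bin yields 4 samples.
state = np.zeros((4, 4), dtype=int)
state[0:2, 0] = 3                       # item of height 3 at cells (0..1, 0)
for s, a in reflect_transitions(state, (0, 0), (2, 1)):
    print(a)                            # (0, 0), (0, 3), (2, 0), (2, 3)
```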