Routing problems are a class of combinatorial problems with many practical applications. Recently, end-to-end deep learning methods have been proposed to learn approximate solution heuristics for such problems. In contrast, classical dynamic programming (DP) algorithms guarantee optimal solutions, but scale poorly with the problem size. We propose Deep Policy Dynamic Programming (DPDP), which aims to combine the strengths of learned neural heuristics with those of DP algorithms. DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions. We evaluate our framework on the travelling salesman problem (TSP), the vehicle routing problem (VRP) and the TSP with time windows (TSPTW), and show that the neural policy improves the performance of (restricted) DP algorithms, making them competitive with strong alternatives such as LKH, while also outperforming most other 'neural approaches' for solving TSPs, VRPs and TSPTWs with 100 nodes.
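To make the core idea concrete, the sketch below shows one way a "policy-restricted" DP for the TSP could look: the exact DP recursion over states (set of visited nodes, current node) is kept, but at every step only the B states ranked highest by a learned edge-scoring policy are retained. This is a minimal illustration under stated assumptions, not the paper's implementation: the precomputed `heat[i][j]` matrix stands in for the neural network's edge predictions, and the function name, the scoring rule (sum of heat values along the partial tour), and the beam size are all illustrative choices.

```python
import heapq

def restricted_dp_tsp(dist, heat, beam_size=128):
    """Exact DP over states (visited set, current node), restricted by a policy.

    dist[i][j]: travel cost between nodes i and j.
    heat[i][j]: placeholder for learned edge scores (higher = more promising);
                in DPDP these would come from a trained neural network.
    Keeps only the `beam_size` highest-scoring states per expansion step.
    """
    n = len(dist)
    # state -> (cost so far, policy score so far, partial tour); start at node 0
    beam = {(frozenset([0]), 0): (0.0, 0.0, [0])}
    for _ in range(n - 1):
        expansions = {}
        for (visited, cur), (cost, score, tour) in beam.items():
            for nxt in range(n):
                if nxt in visited:
                    continue
                key = (visited | {nxt}, nxt)
                cand = (cost + dist[cur][nxt],
                        score + heat[cur][nxt],
                        tour + [nxt])
                # Standard DP dominance: keep the cheapest tour per state.
                if key not in expansions or cand[0] < expansions[key][0]:
                    expansions[key] = cand
        # Policy step: restrict the state space to the B most promising states.
        beam = dict(heapq.nlargest(beam_size, expansions.items(),
                                   key=lambda kv: kv[1][1]))
    # Close the tour back to the depot and return the best complete solution.
    return min((cost + dist[cur][0], tour + [0])
               for (visited, cur), (cost, score, tour) in beam.items())
```

With `beam_size` set to infinity this reduces to the classical exponential Held-Karp-style DP, which guarantees optimality; shrinking the beam trades that guarantee for tractability, and the quality of the result then hinges on how well the learned scores identify edges that appear in good solutions.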