While combinatorial problems are of great academic and practical importance, previous approaches like explicit heuristics and reinforcement learning have been complex and costly. To address this, we developed a simple and robust method to train a Deep Neural Network (DNN) through self-supervised learning for solving a goal-predefined combinatorial problem. Assuming that more optimal moves occur more frequently along a path of random moves connecting two problem states, the DNN can approximate an optimal solver by learning to predict the last move of a random scramble from the resulting problem state. Tested on 1,000 scrambled Rubik's Cube instances, a Transformer-based model solved all of them near-optimally using a breadth-first search; with a maximum breadth of $10^3$, the mean solution length was $20.5$ moves. The proposed method may apply to other goal-predefined combinatorial problems, though it has a few constraints.
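To make the training signal concrete, the following is a minimal sketch of the self-supervised data generation and the greedy form of the inference idea, using a toy permutation puzzle as a stand-in for the Rubik's Cube. The generator names `a`/`b`, the helpers `make_example` and `greedy_solve`, and the callable `predict_last_move` are illustrative assumptions, not the paper's implementation; the paper additionally expands the greedy step into a breadth-limited search over the model's candidate moves.

```python
import random

# Toy goal-predefined puzzle standing in for the Rubik's Cube (assumption for
# illustration): states are permutations of range(8); moves are fixed
# generator permutations and their inverses.
GOAL = tuple(range(8))

# Each move is applied by indexing: new_state[i] = state[perm[i]].
GENERATORS = {
    "a":  (1, 2, 3, 0, 4, 5, 6, 7),   # cycle the first four pieces
    "a'": (3, 0, 1, 2, 4, 5, 6, 7),   # inverse of "a"
    "b":  (0, 1, 2, 3, 5, 6, 7, 4),   # cycle the last four pieces
    "b'": (0, 1, 2, 3, 7, 4, 5, 6),   # inverse of "b"
}
INVERSE = {"a": "a'", "a'": "a", "b": "b'", "b'": "b"}


def apply_move(state, move):
    perm = GENERATORS[move]
    return tuple(state[perm[i]] for i in range(len(state)))


def make_example(scramble_len):
    """Scramble the goal state with random moves; the training pair is
    (scrambled state, last move applied). No solver or human labels needed."""
    state, last = GOAL, None
    for _ in range(scramble_len):
        last = random.choice(list(GENERATORS))
        state = apply_move(state, last)
    return state, last


# Self-supervised dataset: the DNN learns to predict the last scramble move
# from the state alone.
dataset = [make_example(random.randint(1, 20)) for _ in range(10_000)]


def greedy_solve(state, predict_last_move, max_steps=50):
    """Greedy inference sketch: undo the move the model believes was applied
    last, stepping back toward the goal. `predict_last_move` is a hypothetical
    callable standing in for the trained DNN."""
    solution = []
    for _ in range(max_steps):
        if state == GOAL:
            return solution
        undo = INVERSE[predict_last_move(state)]
        state = apply_move(state, undo)
        solution.append(undo)
    return None  # no solution found within max_steps
```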