Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators

May 5, 2023

Alexander Herzog, Kanishka Rao, Karol Hausman, Yao Lu, Paul Wohlhart, Mengyuan Yan, Jessica Lin, Montserrat Gonzalez Arenas, Ted Xiao, Daniel Kappler, Daniel Ho, Jarek Rettinghouse, Yevgen Chebotar, Kuang-Huei Lee, Keerthana Gopalakrishnan, Ryan Julian, Adrian Li, Chuyuan Kelly Fu, Bob Wei, Sangeetha Ramesh, Khem Holden, Kim Kleiven, David Rendleman, Sean Kirmani, Jeff Bingham, Jon Weisz, Ying Xu, Wenlong Lu, Matthew Bennice, Cody Fong, David Do, Jessica Lam, Yunfei Bai, Benjie Holson, Michael Quinlan, Noah Brown, Mrinal Kalakrishnan, Julian Ibarz, Peter Pastor, Sergey Levine

From arXiv; published at Robotics: Science and Systems 2023.

We describe a system for deep reinforcement learning of robotic manipulation skills applied to a large-scale real-world task: sorting recyclables and trash in office buildings. Real-world deployment of deep RL policies requires not only effective training algorithms, but also the ability to bootstrap real-world training and enable broad generalization. To this end, our system combines scalable deep RL from real-world data with bootstrapping from training in simulation, and incorporates auxiliary inputs from existing computer vision systems as a way to boost generalization to novel objects, while retaining the benefits of end-to-end training. We analyze the tradeoffs of different design decisions in our system, and present a large-scale empirical validation that includes training on real-world data gathered over the course of 24 months of experimentation, across a fleet of 23 robots in three office buildings, with a total training set of 9527 hours of robotic experience. Our final validation also consists of 4800 evaluation trials across 240 waste station configurations, in order to evaluate in detail the impact of the design decisions in our system, the scaling effects of including more real-world data, and the performance of the method on novel objects. The project website and videos can be found at http://rl-at-scale.github.io.
