学习教学：面向多样化地形人形机器人行走的样本高效特权学习方法 (Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains) - 专知论文

会员服务 ·

0

Learning · 样本 · 多样性 · 机器人 · 变换 ·

Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains

翻译：学习教学：面向多样化地形人形机器人行走的样本高效特权学习方法

Feiyang Wu,Xavier Nal,Jaehwi Jang,Wei Zhu,Zhaoyuan Gu,Anqi Wu,Ye Zhao

from arxiv, Accepted to IEEE Robotics and Automation Letters (RA-L)

Humanoid robots promise transformative capabilities for industrial and service applications. While recent advances in Reinforcement Learning (RL) yield impressive results in locomotion, manipulation, and navigation, the proposed methods typically require enormous simulation samples to account for real-world variability. This work proposes a novel one-stage training framework-Learn to Teach (L2T)-which unifies teacher and student policy learning. Our approach recycles simulator samples and synchronizes the learning trajectories through shared dynamics, significantly reducing sample complexities and training time while achieving state-of-the-art performance. Furthermore, we validate the RL variant (L2T-RL) through extensive simulations and hardware tests on the Digit robot, demonstrating zero-shot sim-to-real transfer and robust performance over 12+ challenging terrains without depth estimation modules.

翻译：人形机器人在工业与服务应用领域展现出变革性潜力。尽管强化学习（RL）在运动控制、操作与导航方面取得了显著进展，现有方法通常需要海量仿真样本以应对现实世界的复杂性。本研究提出一种新颖的单阶段训练框架——学习教学（L2T），该框架将教师策略与学生策略学习进行统一。我们的方法通过共享动力学机制实现仿真样本的循环利用与学习轨迹的同步优化，在保持最先进性能的同时显著降低了样本复杂度与训练时间。此外，我们通过Digit机器人上的大量仿真实验与硬件测试验证了强化学习变体（L2T-RL），证明了其在无需深度估计模块的情况下，能够实现零样本仿真到现实的迁移，并在12种以上复杂地形中保持鲁棒性能。

0

相关内容

Learning

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control

Arxiv

0+阅读 · 10月23日

Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation

Arxiv

0+阅读 · 10月23日

SHRUMS: Sensor Hallucination for Real-time Underwater Motion Planning with a Compact 3D Sonar

Arxiv

0+阅读 · 10月21日

From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering

Arxiv

0+阅读 · 10月21日

Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards

Arxiv

0+阅读 · 10月21日

VIP会员

文章信息

相关主题

相关VIP内容

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

星链与未来战争

《黑蜂（Black Hummingbird）微型无人机》

《全球地缘政治环境中的反无人机系统互操作性》252页

《美国：为自动驾驶汽车铺平道路——未来出行已来》最新43页报告

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

相关论文

Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control

Arxiv

0+阅读 · 10月23日

Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation

Arxiv

0+阅读 · 10月23日

SHRUMS: Sensor Hallucination for Real-time Underwater Motion Planning with a Compact 3D Sonar

Arxiv

0+阅读 · 10月21日

From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering

Arxiv

0+阅读 · 10月21日

Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards

Arxiv

0+阅读 · 10月21日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员