TASIL: 泰勒系列模拟学习 (TaSIL: Taylor Series Imitation Learning) - 专知论文

会员服务 ·

0

泰勒级数 · 泰勒 · Learning · 损失 · Continuity ·

2023 年 1 月 16 日

TaSIL: Taylor Series Imitation Learning

翻译：TASIL: 泰勒系列模拟学习

Daniel Pfrommer,Thomas T. C. K. Zhang,Stephen Tu,Nikolai Matni

from arxiv, Appeared at NeurIPS 2022. V2: added to related work, updated notation, fixed small errors in appendix

We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control. TaSIL penalizes deviations in the higher-order Taylor series terms between the learned and expert policies. We show that experts satisfying a notion of $\textit{incremental input-to-state stability}$ are easy to learn, in the sense that a small TaSIL-augmented imitation loss over expert trajectories guarantees a small imitation loss over trajectories generated by the learned policy. We provide sample-complexity bounds for TaSIL that scale as $\tilde{\mathcal{O}}(1/n)$ in the realizable setting, for $n$ the number of expert demonstrations. Finally, we demonstrate experimentally the relationship between the robustness of the expert policy and the order of Taylor expansion required in TaSIL, and compare standard Behavior Cloning, DART, and DAgger with TaSIL-loss-augmented variants. In all cases, we show significant improvement over baselines across a variety of MuJoCo tasks.

翻译：我们建议泰勒系列模拟学习(TasIL),这是在持续控制的背景下对标准行为克隆损失的简单补充。TasIL惩罚高阶泰勒系列在所学政策和专家政策之间的偏差。我们表明,专家满足了美元(textitit)/incial Increate-pination-to State sustainable)概念,这是很容易了解的,因为与专家轨迹相比,微小的TasIL(TasIL)的模拟损失保证了在所学政策产生的轨迹上的微小模仿损失。我们提供了TasIL(TasIL)的样本复杂性约束,在可实现的环境下,以美元计为1美元(n)美元,用于专家演示数量。最后,我们实验性地展示了专家政策的稳健性和TasIL(TaSIL)所要求的泰勒扩展顺序之间的关系,并将标准Behavircloning、DART和Dagger(Dagger)与TasIL(M)损失变式比较。我们在所有情况下都显示在一系列任务的基准上显著改进了。

0

相关内容

泰勒级数

泰勒级数的定义若函数f（x）在点的某一邻域内具有直到（n+1）阶导数，则在该邻域内f（x）的n阶泰勒公式为： f(x)=f(x0)+f`( x0)(x- x0)+f``( x0)(x-x0)²/2!+f```( x0)(x- x0)³/3!+...fn(x0)(x- x0)n/n!+.... 其中：fn(x0)(x- x0)n/n!，称为拉格朗日余项。以上函数展开式称为泰勒级数。

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

专知

17+阅读 · 2018年4月28日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

两类带导数的非线性Schrodinger方程拟周期解的存在性

国家自然科学基金

0+阅读 · 2015年12月31日

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

树上生灭过程收敛速度及p-Laplacian特征值估计

国家自然科学基金

0+阅读 · 2015年12月31日

黏弹性流体力学中分数阶微分方程解的适定性研究

国家自然科学基金

0+阅读 · 2015年12月31日

相场方程的弱超内罚间断Galerkin方法及其自适应算法

国家自然科学基金

1+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

化疗诱导的细胞衰老在神经母细胞瘤复发中的作用及分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

振荡型积分的有界性质及其在色散方程中的应用

国家自然科学基金

0+阅读 · 2013年12月31日

某些随机非线性发展方程组的动力学行为

国家自然科学基金

0+阅读 · 2013年12月31日

常微分方程中的几个经典问题

国家自然科学基金

2+阅读 · 2012年12月31日

On the Fusion Strategies for Federated Decision Making

On the Fusion Strategies for Federated Decision Making

Arxiv

0+阅读 · 2023年3月10日

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction

Arxiv

0+阅读 · 2023年3月9日

Faster Adaptive Federated Learning

Arxiv

0+阅读 · 2023年3月9日

Learning Exploration Strategies to Solve Real-World Marble Runs

Arxiv

0+阅读 · 2023年3月8日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Learning Eco-Driving Strategies at Signalized Intersections

Arxiv

16+阅读 · 2022年4月26日

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research

Arxiv

39+阅读 · 2021年8月30日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

专知

17+阅读 · 2018年4月28日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

相关论文

On the Fusion Strategies for Federated Decision Making

On the Fusion Strategies for Federated Decision Making

Arxiv

0+阅读 · 2023年3月10日

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction

Arxiv

0+阅读 · 2023年3月9日

Faster Adaptive Federated Learning

Arxiv

0+阅读 · 2023年3月9日

Learning Exploration Strategies to Solve Real-World Marble Runs

Arxiv

0+阅读 · 2023年3月8日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Learning Eco-Driving Strategies at Signalized Intersections

Arxiv

16+阅读 · 2022年4月26日

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research

Arxiv

39+阅读 · 2021年8月30日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

Deep learning for time series classification: a review

Arxiv

12+阅读 · 2019年3月14日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

相关基金

两类带导数的非线性Schrodinger方程拟周期解的存在性

国家自然科学基金

0+阅读 · 2015年12月31日

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

树上生灭过程收敛速度及p-Laplacian特征值估计

国家自然科学基金

0+阅读 · 2015年12月31日

黏弹性流体力学中分数阶微分方程解的适定性研究

国家自然科学基金

0+阅读 · 2015年12月31日

相场方程的弱超内罚间断Galerkin方法及其自适应算法

国家自然科学基金

1+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

化疗诱导的细胞衰老在神经母细胞瘤复发中的作用及分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

振荡型积分的有界性质及其在色散方程中的应用

国家自然科学基金

0+阅读 · 2013年12月31日

某些随机非线性发展方程组的动力学行为

国家自然科学基金

0+阅读 · 2013年12月31日

常微分方程中的几个经典问题

国家自然科学基金

2+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员