随机动作对随机政策: 启动基于模型的直接政策搜索 (Random Actions vs Random Policies: Bootstrapping Model-Based Direct Policy Search) - 专知论文

会员服务 ·

0

策略搜索 · 自助法/自举法 · MoDELS · Performer · 有向 ·

2022 年 10 月 21 日

Random Actions vs Random Policies: Bootstrapping Model-Based Direct Policy Search

翻译：随机动作对随机政策: 启动基于模型的直接政策搜索

Elias Hanna,Alex Coninx,Stéphane Doncieux

from arxiv, ICML 2022 Workshop Adaptive Experimental Design and Active Learning in the Real World

This paper studies the impact of the initial data gathering method on the subsequent learning of a dynamics model. Dynamics models approximate the true transition function of a given task, in order to perform policy search directly on the model rather than on the costly real system. This study aims to determine how to bootstrap a model as efficiently as possible, by comparing initialization methods employed in two different policy search frameworks in the literature. The study focuses on the model performance under the episode-based framework of Evolutionary methods using probabilistic ensembles. Experimental results show that various task-dependant factors can be detrimental to each method, suggesting to explore hybrid approaches.

翻译：本文研究了初步数据收集方法对随后学习动态模型的影响。动态模型与特定任务的真正过渡功能相近,以便直接对模型而不是对昂贵的实际系统进行政策搜索。本研究报告的目的是通过比较文献中两个不同的政策搜索框架采用的初始化方法,确定如何尽可能高效地捕捉模型。本研究报告侧重于在以事件为基础的进化方法框架下使用概率组合的模型性能。实验结果显示,各种任务依赖因素可能对每种方法都有害,建议探索混合方法。

0

相关内容

策略搜索

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

专知会员服务

23+阅读 · 2022年3月19日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

45+阅读 · 2015年12月31日

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

DGKε/SNARE信号通路在糖尿病肾病足细胞胰岛素抵抗中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

天然产物Artanomalide D及其类似物的全合成和抗肿瘤构效关系研究

国家自然科学基金

0+阅读 · 2012年12月31日

S1P联合PR-MSCs移植在治疗小鼠急性心肌梗死中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

新型金属-有机骨架化合物基超级电容器电极材料的制备与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于n型ZnO纳米线阵列/p型铜铁矿结构薄膜的波长可调谐发光二极管制备、特性和机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

sRAGE对缺血/再灌注的心脏保护作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

Winning the CityLearn Challenge: Adaptive Optimization with Evolutionary Search under Trajectory-based Guidance

Arxiv

0+阅读 · 2022年12月4日

Counterfactual Learning with General Data-generating Policies

Arxiv

0+阅读 · 2022年12月4日

Regularized ERM on random subspaces

Arxiv

0+阅读 · 2022年12月4日

Multivariate Quantile Function Forecaster

Arxiv

0+阅读 · 2022年12月4日

Diverse Generative Perturbations on Attention Space for Transferable Adversarial Attacks

Arxiv

0+阅读 · 2022年12月2日

Flexible social inference facilitates targeted social learning when rewards are not observable

Arxiv

0+阅读 · 2022年12月1日

A Model-based GNN for Learning Precoding

Arxiv

0+阅读 · 2022年12月1日

Efficient variational approximations for state space models

Arxiv

0+阅读 · 2022年11月30日

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Arxiv

25+阅读 · 2019年10月30日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

VIP会员

文章信息

相关主题

自助法/自举法

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

【MIla】一种意识启发规划的基于模型强化学习，A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

专知会员服务

23+阅读 · 2022年3月19日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】用于化学结构抽取的多模态文档理解

《人工智能在军事行动作战规划过程中的应用可能性》

【NeurIPS2025】熵正则化与分布强化学习的收敛定理

智能体安全综述：应用、威胁与防御

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Winning the CityLearn Challenge: Adaptive Optimization with Evolutionary Search under Trajectory-based Guidance

Arxiv

0+阅读 · 2022年12月4日

Counterfactual Learning with General Data-generating Policies

Arxiv

0+阅读 · 2022年12月4日

Regularized ERM on random subspaces

Arxiv

0+阅读 · 2022年12月4日

Multivariate Quantile Function Forecaster

Arxiv

0+阅读 · 2022年12月4日

Diverse Generative Perturbations on Attention Space for Transferable Adversarial Attacks

Arxiv

0+阅读 · 2022年12月2日

Flexible social inference facilitates targeted social learning when rewards are not observable

Arxiv

0+阅读 · 2022年12月1日

A Model-based GNN for Learning Precoding

Arxiv

0+阅读 · 2022年12月1日

Efficient variational approximations for state space models

Arxiv

0+阅读 · 2022年11月30日

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Arxiv

25+阅读 · 2019年10月30日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

相关基金

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

45+阅读 · 2015年12月31日

牛蒡子中Arctignan A，Lappaol C及其衍生物的合成和抗白血病活性研究

国家自然科学基金

0+阅读 · 2014年12月31日

DGKε/SNARE信号通路在糖尿病肾病足细胞胰岛素抵抗中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

天然产物Artanomalide D及其类似物的全合成和抗肿瘤构效关系研究

国家自然科学基金

0+阅读 · 2012年12月31日

S1P联合PR-MSCs移植在治疗小鼠急性心肌梗死中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

新型金属-有机骨架化合物基超级电容器电极材料的制备与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于n型ZnO纳米线阵列/p型铜铁矿结构薄膜的波长可调谐发光二极管制备、特性和机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

sRAGE对缺血/再灌注的心脏保护作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员