Learning and Understanding a Disentangled Feature Representation for Hidden Parameters in Reinforcement Learning - Zhuanzhi Papers

Comprehensibility · Learning · Episode · Feature space · RNN

November 29, 2022

Learning and Understanding a Disentangled Feature Representation for Hidden Parameters in Reinforcement Learning

Christopher Reale, Rebecca Russell

From arXiv. Appears in the Proceedings of the AAAI FSS-22 Symposium "Lessons Learned for Autonomous Assessment of Machine Abilities (LLAAMA)".

Hidden parameters are latent variables in reinforcement learning (RL) environments that are constant over the course of a trajectory. Understanding what, if any, hidden parameters affect a particular environment can aid both the development and appropriate usage of RL systems. We present an unsupervised method to map RL trajectories into a feature space where distance represents the relative difference in system behavior due to hidden parameters. Our approach disentangles the effects of hidden parameters by leveraging a recurrent neural network (RNN) world model as used in model-based RL. First, we alter the standard world model training algorithm to isolate the hidden parameter information in the world model memory. Then, we use a metric learning approach to map the RNN memory into a space with a distance metric approximating a bisimulation metric with respect to the hidden parameters. The resulting disentangled feature space can be used to meaningfully relate trajectories to each other and analyze the hidden parameter. We demonstrate our approach on four hidden parameters across three RL environments. Finally, we present two methods to help identify and understand the effects of hidden parameters on systems.
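To make the idea of a behavior-based distance over hidden parameters concrete, here is a minimal toy sketch. It is not the authors' method: there is no RNN world model and no metric learning. It assumes a hypothetical 1-D point-mass environment in which `friction` plays the role of the hidden parameter, and it uses the mean one-step next-state gap over shared state-action samples as a crude stand-in for a bisimulation-style distance. The names `step` and `behavioral_distance` are illustrative only.

```python
import random


def step(position, velocity, action, friction, dt=0.1):
    """One step of a toy 1-D point mass; `friction` is the (assumed) hidden parameter."""
    velocity = velocity + dt * (action - friction * velocity)
    position = position + dt * velocity
    return position, velocity


def behavioral_distance(friction_a, friction_b, n_samples=1000, seed=0):
    """Mean next-state gap between two hidden-parameter values.

    Both environments are queried with the same sampled (state, action)
    pairs, so any gap is due only to the hidden parameter.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        pos = rng.uniform(-1.0, 1.0)
        vel = rng.uniform(-1.0, 1.0)
        act = rng.uniform(-1.0, 1.0)
        pos_a, vel_a = step(pos, vel, act, friction_a)
        pos_b, vel_b = step(pos, vel, act, friction_b)
        total += abs(pos_a - pos_b) + abs(vel_a - vel_b)
    return total / n_samples


if __name__ == "__main__":
    # Distance from a reference friction of 0.1 to several other values.
    for other in (0.1, 0.5, 1.0):
        print(other, behavioral_distance(0.1, other))
```

In this toy setting the distance is zero when the two hidden-parameter values match and grows as they diverge, which is the qualitative property the abstract asks of the learned feature space. The paper's approach differs in that the hidden parameter is never observed and the distance has to be learned from trajectories.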
