Reusing previously trained models is critical in deep reinforcement learning to speed up the training of new agents. However, it is unclear how to acquire new skills when objectives and constraints conflict with previously learned skills. Moreover, during retraining there is an intrinsic conflict between exploiting what has already been learned and exploring new skills. In soft actor-critic (SAC) methods, a temperature parameter can be dynamically adjusted to weight the action entropy and balance the exploration-exploitation trade-off. However, controlling a single coefficient can be challenging in the context of retraining, even more so when goals are contradictory. In this work, inspired by neuroscience research, we propose a novel approach that uses inhibitory networks to allow separate, adaptive state-value evaluations, as well as distinct automatic entropy tuning. Ultimately, our approach makes it possible to control inhibition so as to handle the conflict between exploiting less risky, already acquired behaviors and exploring novel ones to overcome more challenging tasks. We validate our method through experiments in OpenAI Gym environments.
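For reference, the temperature-weighted entropy term mentioned above follows the standard maximum-entropy SAC formulation with automatic temperature tuning (Haarnoja et al.); the equations below restate that background only, not the inhibitory-network extension proposed here, where $\alpha$ denotes the temperature and $\bar{\mathcal{H}}$ a fixed target entropy:
$$J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\big[\, r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\big], \qquad J(\alpha) = \mathbb{E}_{a_t \sim \pi_t}\big[\, -\alpha \log \pi_t(a_t \mid s_t) - \alpha\, \bar{\mathcal{H}} \,\big].$$
Minimizing $J(\alpha)$ increases $\alpha$ when the policy entropy falls below the target $\bar{\mathcal{H}}$ (encouraging exploration) and decreases it otherwise; our method builds on this mechanism by tuning separate temperatures.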