Actors and critics in actor-critic reinforcement learning algorithms are functionally separate, yet they often use the same network architecture. This case study explores the performance impact of network size when actor and critic architectures are considered independently. By relaxing the assumption of architectural symmetry, we find that substantially smaller actors can often achieve policy performance comparable to their symmetric counterparts. Our experiments show up to a 97% reduction in the number of network weights, with an average reduction of 64%, across multiple algorithms and tasks. Given the practical benefits of reducing actor complexity, we believe the configurations of actors and critics are aspects of actor-critic design that deserve to be considered independently.
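To make the scale of the claimed savings concrete, the following is a minimal sketch of how weight counts compare between a symmetric actor and a much smaller asymmetric one. The layer widths here (a 17-dimensional observation, 6-dimensional action, and hidden layers of 256 vs. 32 units) are illustrative assumptions, not the configurations used in the study; the point is only that shrinking the actor's hidden layers yields a weight reduction on the order reported in the abstract.

```python
def mlp_params(layer_sizes):
    """Count weights and biases in a fully connected MLP
    with the given sequence of layer widths."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical symmetric actor: two hidden layers of 256 units
# (matching a typical critic), 17-dim observation, 6-dim action.
symmetric = mlp_params([17, 256, 256, 6])

# Hypothetical shrunken actor: two hidden layers of 32 units.
small = mlp_params([17, 32, 32, 6])

reduction = 1 - small / symmetric
print(f"symmetric actor: {symmetric} parameters")
print(f"small actor:     {small} parameters")
print(f"reduction:       {reduction:.1%}")
```

Under these assumed widths the small actor retains only a few percent of the symmetric actor's weights, in the same regime as the up-to-97% reduction the abstract reports.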