As the third generation of neural networks, Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, Deep Spiking Reinforcement Learning (DSRL), i.e., Reinforcement Learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and non-differentiability of the spiking function. To address these issues, we propose a Deep Spiking Q-Network (DSQN) in this paper. Specifically, we propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network (DQN). We then adapt a direct spiking learning algorithm to the Deep Spiking Q-Network, and further demonstrate the advantages of using LIF neurons in DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, robustness and energy efficiency. To the best of our knowledge, ours is the first work to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN.
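To make the binary-output behavior mentioned above concrete, the following is a minimal sketch of the standard discrete-time LIF neuron dynamics that architectures like this one build on. It is an illustrative simulation only, not the paper's implementation; the hyperparameter values (`tau`, `v_threshold`, `v_reset`) are placeholder choices.

```python
import numpy as np

def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate a layer of Leaky Integrate-and-Fire (LIF) neurons.

    inputs: array of shape (T, N) -- input current per timestep per neuron.
    Returns a binary spike train of shape (T, N).
    Hyperparameters here are illustrative, not the paper's settings.
    """
    T, N = inputs.shape
    v = np.full(N, v_reset, dtype=float)       # membrane potentials
    spikes = np.zeros((T, N))
    for t in range(T):
        # Leaky integration: potential decays toward v_reset, driven by input.
        v = v + (inputs[t] - (v - v_reset)) / tau
        s = (v >= v_threshold).astype(float)   # binary spike on threshold crossing
        spikes[t] = s
        v = np.where(s > 0, v_reset, v)        # hard reset after firing
    return spikes
```

The threshold step function `v >= v_threshold` is the non-differentiable point the abstract refers to; direct-training approaches typically replace its derivative with a smooth surrogate during backpropagation so gradients can flow through the spike.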