QHD: A brain-inspired hyperdimensional reinforcement learning algorithm - 专知 (Zhuanzhi) Papers

Learning · Memory capacity · Batch size · Reinforcement learning · Episode

July 25, 2022

QHD: A brain-inspired hyperdimensional reinforcement learning algorithm

Yang Ni, Danny Abraham, Mariam Issa, Yeseong Kim, Pietro Mercati, Mohsen Imani

Reinforcement Learning (RL) has opened up new opportunities to solve a wide range of complex decision-making tasks. However, modern RL algorithms, e.g., Deep Q-Learning, are based on deep neural networks, which incur high computational costs when running on edge devices. In this paper, we propose QHD, a hyperdimensional reinforcement learning algorithm that mimics brain properties toward robust and real-time learning. QHD relies on a lightweight brain-inspired model to learn an optimal policy in an unknown environment. We first develop a novel mathematical foundation and an encoding module that maps the state-action space into a high-dimensional space. We then develop a hyperdimensional regression model to approximate the Q-value function. The QHD-powered agent makes decisions by comparing the Q-values of each possible action. We evaluate the effect of different RL training batch sizes and local memory capacities on QHD's quality of learning. QHD is also capable of online learning with a tiny local memory capacity, which can be as small as the training batch size. QHD provides real-time learning by further decreasing the memory capacity and the batch size. This makes QHD suitable for highly efficient reinforcement learning in edge environments, where it is crucial to support online and real-time learning. Our solution also supports a small experience replay batch size that provides a 12.3 times speedup compared to DQN while ensuring minimal quality loss. Our evaluation shows QHD's capability for real-time learning, providing a 34.6 times speedup and significantly better quality of learning than state-of-the-art deep RL algorithms.
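
The abstract names three pieces: an encoder into a high-dimensional space, a hyperdimensional regression model for the Q-value function, and action selection by comparing Q-values. The following is a minimal sketch of how those pieces could fit together, not the authors' implementation. The abstract does not specify the encoder or the update rule, so the random projection with a cosine nonlinearity, the one-model-hypervector-per-action layout, the error-scaled update, and every name and default (`HDQSketch`, `dim`, `lr`, `gamma`) are assumptions made for illustration.

```python
import math
import random


class HDQSketch:
    """Toy hyperdimensional Q-value regressor (illustrative assumptions only)."""

    def __init__(self, state_dim, n_actions, dim=256, lr=0.5, gamma=0.9, seed=0):
        rng = random.Random(seed)
        # Assumed encoder: fixed random projection plus a random phase shift.
        self.proj = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                     for _ in range(dim)]
        self.bias = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim)]
        # Assumed layout: one model hypervector per action, starting at zero.
        self.models = [[0.0] * dim for _ in range(n_actions)]
        self.dim, self.lr, self.gamma = dim, lr, gamma

    def encode(self, state):
        """Map a low-dimensional state to a dim-sized hypervector."""
        return [math.cos(sum(w * s for w, s in zip(row, state)) + b)
                for row, b in zip(self.proj, self.bias)]

    def _predict(self, model, h):
        # Q estimate: normalized dot product of model and state hypervectors.
        return sum(m_i * h_i for m_i, h_i in zip(model, h)) / self.dim

    def q_values(self, state):
        h = self.encode(state)
        return [self._predict(m, h) for m in self.models]

    def act(self, state):
        """Greedy action: compare the Q-values of every possible action."""
        q = self.q_values(state)
        return max(range(len(q)), key=q.__getitem__)

    def update(self, state, action, reward, next_state, done):
        """One regression step toward the standard Q-learning target."""
        h = self.encode(state)
        target = reward
        if not done:
            target += self.gamma * max(self.q_values(next_state))
        err = target - self._predict(self.models[action], h)
        model = self.models[action]
        for i in range(self.dim):
            # Add the encoded state, scaled by the prediction error.
            model[i] += self.lr * err * h[i]
        return err


# Usage: a one-step task where action 1 pays reward 1 and action 0 pays 0.
agent = HDQSketch(state_dim=1, n_actions=2, dim=64, seed=1)
for _ in range(50):
    agent.update([0.0], 1, 1.0, [0.0], True)
    agent.update([0.0], 0, 0.0, [0.0], True)
best_action = agent.act([0.0])
```

Each update costs one encoding and one vector addition, with no backpropagation, which is the kind of lightweight operation the abstract contrasts with deep neural networks. Experience replay and a bounded local memory are left out of this sketch.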


