Deep Reinforcement Learning (RL) has considerably advanced over the past decade. At the same time, state-of-the-art RL algorithms require a large computational budget in terms of training time to converge. Recent work has started to approach this problem through the lens of quantum computing, which promises theoretical speed-ups for several traditionally hard tasks. In this work, we examine a class of hybrid quantum-classical RL algorithms that we collectively refer to as variational quantum deep Q-networks (VQ-DQN). We show that VQ-DQN approaches are subject to instabilities that cause the learned policy to diverge, study the extent to which this affects the reproducibility of established results based on classical simulation, and perform systematic experiments to identify potential explanations for the observed instabilities. Additionally, and in contrast to most existing work on quantum reinforcement learning, we execute RL algorithms on an actual quantum processing unit (an IBM Quantum Device) and investigate differences in behaviour between simulated and physical quantum systems that suffer from implementation deficiencies. Our experiments show that, contrary to claims in the literature, it cannot be conclusively decided whether known quantum approaches, even if simulated without physical imperfections, provide an advantage over classical approaches. Finally, we provide a robust, universal and well-tested implementation of VQ-DQN as a reproducible testbed for future experiments.
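To make the class of algorithms concrete, the following is a minimal, illustrative sketch of a variational quantum Q-network as it is commonly formulated in the VQ-DQN literature: a parameterised quantum circuit encodes the classical state, and its measured expectation values are read out as Q-values inside an otherwise standard deep Q-learning loop. The sketch assumes PennyLane with the PyTorch interface; the qubit count, ansatz, layer depth and all hyperparameters are illustrative assumptions and not taken from the implementation described in this work.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a variational
# quantum Q-network for a small discrete-action environment such as CartPole.
import pennylane as qml
import torch

n_qubits = 4    # one qubit per state feature (illustrative)
n_layers = 2    # depth of the variational ansatz (illustrative)
n_actions = 2   # one Q-value per action, read from single-qubit expectations

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def q_circuit(state, weights):
    # Encode the classical state into qubit rotations (angle encoding).
    qml.AngleEmbedding(state, wires=range(n_qubits))
    # Trainable entangling layers play the role of the hidden layers of a DQN.
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # One expectation value per action serves as the (unscaled) Q-value.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_actions)]

# Trainable circuit parameters, optimised classically by gradient descent.
weight_shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
weights = torch.nn.Parameter(torch.randn(weight_shape))

state = torch.tensor([0.1, -0.2, 0.05, 0.0])
q_values = q_circuit(state, weights)  # fed into a standard temporal-difference loss
```

In a full VQ-DQN agent, a circuit of this kind replaces the neural network of a classical DQN, while experience replay, target networks and the temporal-difference loss remain classical.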