具有依赖历史的动态背景的强化学习 (Reinforcement Learning with History-Dependent Dynamic Contexts) - 专知论文

会员服务 ·

0

Learning · 强化学习 · Markov · motivation · CASES ·

2023 年 2 月 4 日

Reinforcement Learning with History-Dependent Dynamic Contexts

翻译：具有依赖历史的动态背景的强化学习

Guy Tennenholtz,Nadav Merlis,Lior Shani,Martin Mladenov,Craig Boutilier

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time. We consider special cases of the model, with a focus on logistic DCMDPs, which break the exponential dependence on history length by leveraging aggregation functions to determine context transitions. This special structure allows us to derive an upper-confidence-bound style algorithm for which we establish regret bounds. Motivated by our theoretical results, we introduce a practical model-based algorithm for logistic DCMDPs that plans in a latent space and uses optimism over history-dependent features. We demonstrate the efficacy of our approach on a recommendation task (using MovieLens data) where user behavior dynamics evolve in response to recommendations.

翻译：我们引入了动态背景Markov 决策程序(DCMDPs),这是一个具有历史依赖性的环境的新强化学习框架,它概括了背景的 MDP 框架,以处理环境随时间变化而变化的非马尔科夫环境。我们考虑了模型的特殊案例,重点是后勤的 DCMDPs,它通过利用聚合功能决定背景转型,打破了对历史长度的指数依赖。这一特殊结构使我们能够获得一种具有高度信任性的算法,我们为此建立了悔恨界限。根据我们的理论结果,我们引入了一种基于逻辑的基于模型的逻辑计算法,用于在潜在空间中进行规划,并使用对历史依赖特征的乐观。我们展示了我们对建议任务(使用MovesLens数据)的处理方法的有效性,即用户行为动态根据建议演变。

0

相关内容

Learning

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

LibRec 精选：推荐的可解释性[综述]

LibRec 精选：推荐的可解释性[综述]

LibRec智能推荐

10+阅读 · 2018年5月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

肥胖相关Hepatokine LECT2在肝脏中的调控及机制

国家自然科学基金

1+阅读 · 2015年12月31日

宽带隙氮/氧化物广义掺杂规则及对载流子特性的调控研究

国家自然科学基金

0+阅读 · 2013年12月31日

限域于介孔金属-有机框架内的金属纳米颗粒催化C-H活化反应的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于PCE的多层多域光网络QoS组播路由多目标优化算法研究

国家自然科学基金

0+阅读 · 2012年12月31日

动态云环境中基于SLA的工作流调度

国家自然科学基金

0+阅读 · 2012年12月31日

上下文感知的Web服务自适应计算模型研究

国家自然科学基金

0+阅读 · 2012年12月31日

用外显子组捕获测序技术鉴定Olmsted型掌跖角化症的致病基因

国家自然科学基金

0+阅读 · 2011年12月31日

RIP在致癌剂诱导NF-κ#27963;化及肺上皮细胞恶性转化中的作用机理的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

Hippo-YAP信号通路在hPCSC分化过程中的作用和机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

服务质量要素驱动的空间信息服务组合执行技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

Model-Based Reinforcement Learning with Isolated Imaginations

Arxiv

0+阅读 · 2023年3月27日

Robotic Packaging Optimization with Reinforcement Learning

Arxiv

0+阅读 · 2023年3月26日

Learning with Explanation Constraints

Arxiv

0+阅读 · 2023年3月25日

Multi-Task Reinforcement Learning in Continuous Control with Successor Feature-Based Concurrent Composition

Arxiv

0+阅读 · 2023年3月24日

RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research

Arxiv

0+阅读 · 2023年3月23日

Reinforcement Learning with Exogenous States and Rewards

Arxiv

0+阅读 · 2023年3月22日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

Arxiv

20+阅读 · 2018年1月8日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

相关VIP内容

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

Meta最新WWW2022《联邦计算导论》教程，附77页ppt

专知会员服务

60+阅读 · 2022年5月5日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

LibRec 精选：推荐的可解释性[综述]

LibRec 精选：推荐的可解释性[综述]

LibRec智能推荐

10+阅读 · 2018年5月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Model-Based Reinforcement Learning with Isolated Imaginations

Arxiv

0+阅读 · 2023年3月27日

Robotic Packaging Optimization with Reinforcement Learning

Arxiv

0+阅读 · 2023年3月26日

Learning with Explanation Constraints

Arxiv

0+阅读 · 2023年3月25日

Multi-Task Reinforcement Learning in Continuous Control with Successor Feature-Based Concurrent Composition

Arxiv

0+阅读 · 2023年3月24日

RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research

Arxiv

0+阅读 · 2023年3月23日

Reinforcement Learning with Exogenous States and Rewards

Arxiv

0+阅读 · 2023年3月22日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

Arxiv

20+阅读 · 2018年1月8日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

相关基金

肥胖相关Hepatokine LECT2在肝脏中的调控及机制

国家自然科学基金

1+阅读 · 2015年12月31日

宽带隙氮/氧化物广义掺杂规则及对载流子特性的调控研究

国家自然科学基金

0+阅读 · 2013年12月31日

限域于介孔金属-有机框架内的金属纳米颗粒催化C-H活化反应的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于PCE的多层多域光网络QoS组播路由多目标优化算法研究

国家自然科学基金

0+阅读 · 2012年12月31日

动态云环境中基于SLA的工作流调度

国家自然科学基金

0+阅读 · 2012年12月31日

上下文感知的Web服务自适应计算模型研究

国家自然科学基金

0+阅读 · 2012年12月31日

用外显子组捕获测序技术鉴定Olmsted型掌跖角化症的致病基因

国家自然科学基金

0+阅读 · 2011年12月31日

RIP在致癌剂诱导NF-κ#27963;化及肺上皮细胞恶性转化中的作用机理的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

Hippo-YAP信号通路在hPCSC分化过程中的作用和机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

服务质量要素驱动的空间信息服务组合执行技术研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员