【AAAI2021】Lipschitz Lifelong Reinforcement Learning - 专知


December 14, 2020 · 专知

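We study the problem of knowledge transfer when an agent faces a series of reinforcement learning (RL) tasks. We introduce a novel metric between Markov decision processes (MDPs) and establish that MDPs that are close under this metric have close optimal value functions. Formally, the optimal value functions are Lipschitz continuous with respect to the task space. Building on these theoretical results, we propose a value-transfer method for lifelong RL and use it to build a PAC-MDP algorithm with an improved convergence rate. We illustrate the benefits of the method in lifelong RL experiments.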

https://www.zhuanzhi.ai/paper/031fb6db56a53d5fc61281f327beddd5
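To make the value-transfer idea concrete, here is a minimal sketch under simplifying assumptions. If the optimal Q-function is Lipschitz in the task, so that |Q*_new(s, a) - Q*_prev(s, a)| ≤ L · d(M_prev, M_new), then a previously solved task yields an upper bound on the new task's optimal values, which an optimistic learner could use in place of the generic bound r_max / (1 - gamma). The function name `transfer_upper_bound`, the scalar `lipschitz_const`, and the scalar `task_distance` are hypothetical and chosen for illustration; the paper defines its own metric between MDPs, which this sketch does not reproduce.

```python
from typing import List


def transfer_upper_bound(
    q_prev: List[List[float]],
    lipschitz_const: float,
    task_distance: float,
    r_max: float,
    gamma: float,
) -> List[List[float]]:
    """Upper-bound the optimal Q-values of a new task from a previous task.

    Assumes (hypothetically) that |Q*_new - Q*_prev| <= L * d holds for
    every state-action pair, with L = lipschitz_const and d = task_distance.
    q_prev[s][a] is the optimal Q-value of the previous task.
    """
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    if lipschitz_const < 0.0 or task_distance < 0.0:
        raise ValueError("Lipschitz constant and distance must be non-negative")

    # Generic bound that holds for any task with rewards in [0, r_max].
    v_max = r_max / (1.0 - gamma)
    # Slack allowed by Lipschitz continuity in the task space.
    slack = lipschitz_const * task_distance

    # Keep the tighter of the transferred bound and the generic bound.
    return [[min(v_max, q + slack) for q in row] for row in q_prev]


if __name__ == "__main__":
    q_prev = [[0.5, 1.0], [1.8, 0.0]]  # 2 states x 2 actions, made-up values
    bound = transfer_upper_bound(
        q_prev, lipschitz_const=2.0, task_distance=0.25, r_max=1.0, gamma=0.5
    )
    print(bound)
```

The closer the two tasks are, the smaller the slack and the tighter the transferred bound, so fewer state-action pairs need optimistic exploration. When the tasks are far apart, the bound falls back to r_max / (1 - gamma) and the transfer does no harm under the stated assumption.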


