Discounting the Past - Zhuanzhi Papers

October 20, 2021

Discounting the Past


Taylor Dohmen, Ashutosh Trivedi

Stochastic games with discounted payoff, introduced by Shapley, model adversarial interactions in stochastic environments where two players try to optimize a discounted sum of rewards. In this model, long-term weights are geometrically attenuated based on the delay in their occurrence. We propose a temporally dual notion -- called past-discounting -- where agents have geometrically decaying memory of the rewards encountered during a play of the game. We study objective functions based on past-discounted weight sequences and examine the corresponding stochastic games with liminf, discounted, and mean payoffs. For objectives specified as the limit inferior of past-discounted reward sequences, we show that positional determinacy fails and that optimal strategies may require unbounded memory. To overcome this obstacle, we study an approximate windowed objective based on the idea of using sliding windows of finite length to examine infinite plays. On the other hand, for objectives specified as the discounted and average limits of past-discounted reward sequences we establish determinacy in mixed stationary strategies in the setting of concurrent stochastic games and show how the values of these games may be computed via reductions to standard discounted and mean-payoff games.
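The contrast between the two kinds of discounting described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the past-discounted value at step n to be the sum of lam**(n-i) * w_i over all i up to n, which is one natural reading of "geometrically decaying memory". The paper's exact definition (indexing, normalization) may differ, and the function names and the symbol `lam` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def discounted_sum(rewards: Sequence[float], lam: float) -> float:
    """Standard (future) discounting over a finite prefix.

    Reward w_i is attenuated by lam**i, so later rewards count less.
    """
    total = 0.0
    for i, w in enumerate(rewards):
        total += (lam ** i) * w
    return total


def past_discounted_sequence(rewards: Sequence[float], lam: float) -> List[float]:
    """Assumed past-discounted sequence p_0, p_1, ...

    p_n = sum_{i <= n} lam**(n - i) * w_i, so older rewards count less.
    Computed with the recurrence p_n = lam * p_{n-1} + w_n.
    """
    out: List[float] = []
    acc = 0.0
    for w in rewards:
        acc = lam * acc + w  # decay the remembered total, then add the new reward
        out.append(acc)
    return out


if __name__ == "__main__":
    rewards = [1.0, 0.0, 0.0, 1.0]
    print(discounted_sum(rewards, 0.5))            # 1.125
    print(past_discounted_sequence(rewards, 0.5))  # [1.0, 0.5, 0.25, 1.125]
```

Under this reading, the objectives named in the abstract (liminf, discounted, and mean payoff) would be evaluated on the sequence returned by `past_discounted_sequence` rather than on the raw rewards.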

