成本发生反向变化的时速最短路径 (Stochastic Shortest Path with Adversarially Changing Costs) - 专知论文

会员服务 ·

0

INTERACT · 泛函 · 代价函数 · 极小点 · 路径 ·

2021 年 4 月 29 日

Stochastic Shortest Path with Adversarially Changing Costs

翻译：成本发生反向变化的时速最短路径

Aviv Rosenberg,Yishay Mansour

Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an agent has to reach a goal state in minimum total expected cost. In this paper we present the adversarial SSP model that also accounts for adversarial changes in the costs over time, while the underlying transition function remains unchanged. Formally, an agent interacts with an SSP environment for $K$ episodes, the cost function changes arbitrarily between episodes, and the transitions are unknown to the agent. We develop the first algorithms for adversarial SSPs and prove high probability regret bounds of $\widetilde O (\sqrt{K})$ assuming all costs are strictly positive, and $\widetilde O (K^{3/4})$ in the general case. We are the first to consider this natural setting of adversarial SSP and obtain sub-linear regret for it.

翻译：最短的托盘路径(SSP)在规划和控制方面是一个众所周知的问题,在这种路径中,代理人必须达到一个最低预期总成本的目标状态。在本文中,我们介绍了对抗性SSP模式,该模式也反映了长期成本的对抗性变化,而基本过渡功能保持不变。形式上,代理人与SSP环境发生互动,出现K美元事件,费用函数在时间和过渡之间任意变化,代理商不知道。我们为对抗性SSP开发了第一个算法,并证明,假设所有成本都绝对是正数,而美元(sqrt{K})和美元(k ⁇ 3/4})在一般情况下为美元,则极有可能后悔。我们首先考虑对抗性SSP的自然环境,并为此获得亚线性遗憾。

0

相关内容

INTERACT

IFIP TC13 Conference on Human-Computer Interaction是人机交互领域的研究者和实践者展示其工作的重要平台。多年来，这些会议吸引了来自几个国家和文化的研究人员。官网链接：http://interact2019.org/

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【ACL2020】命名实体识别即依存解析，Named Entity Recognition as Dependency Parsing

【ACL2020】命名实体识别即依存解析，Named Entity Recognition as Dependency Parsing

专知会员服务

61+阅读 · 2020年5月15日

【IJCAI2020】图神经网络预测结构化实体交互

【IJCAI2020】图神经网络预测结构化实体交互

专知会员服务

43+阅读 · 2020年5月13日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Minimax Problems

Arxiv

0+阅读 · 2021年6月18日

Communication-Efficient Distributed Optimization with Quantized Preconditioners

Arxiv

0+阅读 · 2021年6月17日

Generalized regression operator estimation for continuous time functional data processes with missing at random response

Arxiv

0+阅读 · 2021年6月17日

On the Power of Preconditioning in Sparse Linear Regression

Arxiv

0+阅读 · 2021年6月17日

Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes

Arxiv

0+阅读 · 2021年6月16日

Graph Neural Networks Inspired by Classical Iterative Algorithms

Arxiv

0+阅读 · 2021年6月16日

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

Arxiv

0+阅读 · 2021年6月15日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

Variance Reduction Methods for Sublinear Reinforcement Learning

Arxiv

4+阅读 · 2018年4月25日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

VIP会员

文章信息

相关主题

相关VIP内容

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【ACL2020】命名实体识别即依存解析，Named Entity Recognition as Dependency Parsing

【ACL2020】命名实体识别即依存解析，Named Entity Recognition as Dependency Parsing

专知会员服务

61+阅读 · 2020年5月15日

【IJCAI2020】图神经网络预测结构化实体交互

【IJCAI2020】图神经网络预测结构化实体交互

专知会员服务

43+阅读 · 2020年5月13日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争背景下俄罗斯的战略性海军分析（2022-2025年）》最新100页报告

【斯坦福博士论文】数据、决策与依赖：构建可信人工智能的挑战

人工智能时代背景下的未来海战

接触战中的无人机优势：美军旅级部队面临的小型无人机系统挑战与调整

相关资讯

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Minimax Problems

Arxiv

0+阅读 · 2021年6月18日

Communication-Efficient Distributed Optimization with Quantized Preconditioners

Arxiv

0+阅读 · 2021年6月17日

Generalized regression operator estimation for continuous time functional data processes with missing at random response

Arxiv

0+阅读 · 2021年6月17日

On the Power of Preconditioning in Sparse Linear Regression

Arxiv

0+阅读 · 2021年6月17日

Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes

Arxiv

0+阅读 · 2021年6月16日

Graph Neural Networks Inspired by Classical Iterative Algorithms

Arxiv

0+阅读 · 2021年6月16日

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

Arxiv

0+阅读 · 2021年6月15日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

Variance Reduction Methods for Sublinear Reinforcement Learning

Arxiv

4+阅读 · 2018年4月25日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

微信扫码咨询专知VIP会员