May 4, 2021

LP Formulations of sufficient statistic based strategies in Finite Horizon Two-Player Zero-Sum Stochastic Bayesian games


Nabiha Nasir Orpa, Lichun Li

This paper studies two-player zero-sum stochastic Bayesian games in which each player has its own dynamic state that is unknown to the other player. Using standard techniques, we provide the recursive formulas and sufficient statistics for both the primal game and its dual games. We also show that, with a specific initial parameter, the optimal strategy of a player in a dual game is also that player's optimal strategy in the primal game. To handle Bayesian games with a long finite horizon, we provide an algorithm that computes suboptimal strategies of the players step by step, which avoids the LP complexity. To this end, we use LPs to find the special initial parameters in the dual games and to update the sufficient statistics of the dual games. The performance analysis provides an upper bound on the performance difference between the optimal and suboptimal strategies. The main results are demonstrated on a security problem in underwater sensor networks.
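The abstract does not state the recursive formulas, so the following is only a generic illustration of what a belief-type sufficient statistic can look like, not the paper's construction. It is a minimal sketch assuming that the opponent's action is publicly observed, that the opponent's strategy depends only on its current hidden state, and that this state evolves as a Markov chain. The names `update_belief`, `strategy`, and `transition` are hypothetical.

```python
from typing import List


def update_belief(
    belief: List[float],
    strategy: List[List[float]],
    transition: List[List[List[float]]],
    action: int,
) -> List[float]:
    """One step of a standard Bayesian filter over a hidden state.

    belief[k]            : prior probability that the opponent is in state k
    strategy[k][a]       : assumed probability that the opponent plays a in state k
    transition[k][a][k2] : assumed probability of moving from k to k2 after a
    """
    n = len(belief)
    # Weight each hidden state by how likely it makes the observed action.
    weights = [belief[k] * strategy[k][action] for k in range(n)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observed action has zero probability under the assumed strategy")
    posterior = [w / total for w in weights]
    # Propagate the posterior through the assumed state dynamics.
    return [
        sum(posterior[k] * transition[k][action][k2] for k in range(n))
        for k2 in range(n)
    ]


# Illustrative numbers only: two hidden states, two actions.
prior = [0.5, 0.5]
strategy = [[0.8, 0.2], [0.4, 0.6]]
# The state persists with probability 0.9, regardless of the action.
transition = [
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.1, 0.9], [0.1, 0.9]],
]
next_belief = update_belief(prior, strategy, transition, action=0)
print(next_belief)
```

In this sketch the belief vector summarizes the whole action history, which is the sense in which such a quantity acts as a sufficient statistic. The statistics used in the dual games are described in the abstract only as being updated through LPs, so they are not reproduced here.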

