部分可部分遵守的最低年龄:贪婪政策 (Partially Observable Minimum-Age Scheduling: The Greedy Policy) - 专知论文

会员服务 ·

0

贪心 · INFORMS · Processing（编程语言） · 样本 · 部分可观测马尔可夫决策过程 ·

2021 年 3 月 3 日

Partially Observable Minimum-Age Scheduling: The Greedy Policy

翻译：部分可部分遵守的最低年龄:贪婪政策

Yulin Shao,Qi Cao,Soung Chang Liew,He Chen

from arxiv, 16 pages, 10 figures

This paper studies the minimum-age scheduling problem in a wireless sensor network where an access point (AP) monitors the state of an object via a set of sensors. The freshness of the sensed state, measured by the age-of-information (AoI), varies at different sensors and is not directly observable to the AP. The AP has to decide which sensor to query/sample in order to get the most updated state information of the object (i.e., the state information with the minimum AoI). In this paper, we formulate the minimum-age scheduling problem as a multi-armed bandit problem with partially observable arms and explore the greedy policy to minimize the expected AoI sampled over an infinite horizon. To analyze the performance of the greedy policy, we 1) put forth a relaxed greedy policy that decouples the sampling processes of the arms, 2) formulate the sampling process of each arm as a partially observable Markov decision process (POMDP), and 3) derive the average sampled AoI under the relaxed greedy policy as a sum of the average AoI sampled from individual arms. Numerical and simulation results validate that the relaxed greedy policy is an excellent approximation to the greedy policy in terms of the expected AoI sampled over an infinite horizon.

翻译：本文研究无线传感器网络中的最小年龄排程问题,在这个网络中,一个接入点(AP)通过一组传感器监测物体状态。根据信息年龄(AoI)衡量的感知状态的新鲜程度在不同传感器上各不相同,并且无法直接观察AP。AP必须决定哪个传感器来查询/抽样,以便获得该物体的最新状态信息(即国家与最低AoI的资料)。在本文中,我们将最低年龄排程问题作为多武装匪盗用部分可部分可观察武器监测的多武装问题,并探索将预期的AoI抽样在无限地平地范围内进行最小化的贪婪政策。为了分析贪婪政策的执行情况,我们1)提出了宽松的贪婪政策,将武器取样过程分开,2 将每个手臂的取样过程作为部分可部分可观察的Markov决策过程(POMDP),3 根据宽松的贪婪政策,将平均抽样的AoI作为从个别武器中抽取的平均AoI政策的总和。Numerical和模拟结果证实,贪婪政策是贪婪政策的预期前景。

0

相关内容

【DeepMind教程】蒙特卡罗树搜索，60页ppt

专知会员服务

59+阅读 · 2021年4月7日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【开放书】部分观测动态系统的贝叶斯学习，119页pdf，Bayesian Learning for partially observed dynamical systems

【开放书】部分观测动态系统的贝叶斯学习，119页pdf，Bayesian Learning for partially observed dynamical systems

专知会员服务

41+阅读 · 2019年12月27日

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

专知会员服务

122+阅读 · 2019年11月24日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

强化学习扫盲贴：从Q-learning到DQN

强化学习扫盲贴：从Q-learning到DQN

夕小瑶的卖萌屋

52+阅读 · 2019年10月13日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

一个强化学习 Q-learning 算法的简明教程

一个强化学习 Q-learning 算法的简明教程

数据挖掘入门与实战

10+阅读 · 2018年3月18日

carla 体验效果及代码

carla 体验效果及代码

CreateAMind

7+阅读 · 2018年2月3日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Conservative Safety Critics for Exploration

Arxiv

0+阅读 · 2021年4月26日

Decentralized Edge-to-Cloud Load-balancing: Service Placement for the Internet of Things

Arxiv

0+阅读 · 2021年4月23日

Decentralized Federated Averaging

Arxiv

0+阅读 · 2021年4月23日

Decentralized Multi-Agents by Imitation of a Centralized Controller

Arxiv

0+阅读 · 2021年4月22日

Minimizing the Sum of Age of Information and Transmission Cost under Stochastic Arrival Model

Arxiv

0+阅读 · 2021年4月22日

An Online Scheduling Algorithm for Energy Minimization in Wireless Powered Mobile Edge Computing Networks

Arxiv

0+阅读 · 2021年4月22日

On Improving Decentralized Hysteretic Deep Reinforcement Learning

On Improving Decentralized Hysteretic Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年12月15日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Arxiv

3+阅读 · 2018年7月16日

Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management

Arxiv

4+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

Processing（编程语言）

部分可观测马尔可夫决策过程

相关VIP内容

【DeepMind教程】蒙特卡罗树搜索，60页ppt

专知会员服务

59+阅读 · 2021年4月7日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【开放书】部分观测动态系统的贝叶斯学习，119页pdf，Bayesian Learning for partially observed dynamical systems

【开放书】部分观测动态系统的贝叶斯学习，119页pdf，Bayesian Learning for partially observed dynamical systems

专知会员服务

41+阅读 · 2019年12月27日

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

专知会员服务

122+阅读 · 2019年11月24日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

机器人领域中最佳的三维场景表示是什么？——从几何表示到基础模型

《多域作战兵棋推演：运用形态学分析与人工智能加强国防人员训练》

【博士论文】快速高效的归一化流及其在图像生成模型中的应用

仿生机器人技术的军事应用

相关资讯

强化学习扫盲贴：从Q-learning到DQN

强化学习扫盲贴：从Q-learning到DQN

夕小瑶的卖萌屋

52+阅读 · 2019年10月13日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

一个强化学习 Q-learning 算法的简明教程

一个强化学习 Q-learning 算法的简明教程

数据挖掘入门与实战

10+阅读 · 2018年3月18日

carla 体验效果及代码

carla 体验效果及代码

CreateAMind

7+阅读 · 2018年2月3日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Conservative Safety Critics for Exploration

Arxiv

0+阅读 · 2021年4月26日

Decentralized Edge-to-Cloud Load-balancing: Service Placement for the Internet of Things

Arxiv

0+阅读 · 2021年4月23日

Decentralized Federated Averaging

Arxiv

0+阅读 · 2021年4月23日

Decentralized Multi-Agents by Imitation of a Centralized Controller

Arxiv

0+阅读 · 2021年4月22日

Minimizing the Sum of Age of Information and Transmission Cost under Stochastic Arrival Model

Arxiv

0+阅读 · 2021年4月22日

An Online Scheduling Algorithm for Energy Minimization in Wireless Powered Mobile Edge Computing Networks

Arxiv

0+阅读 · 2021年4月22日

On Improving Decentralized Hysteretic Deep Reinforcement Learning

On Improving Decentralized Hysteretic Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年12月15日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Arxiv

3+阅读 · 2018年7月16日

Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management

Arxiv

4+阅读 · 2018年1月16日

微信扫码咨询专知VIP会员