Sequential Decision Problems with Weak Feedback - 专知 Papers


December 22, 2022

Sequential Decision Problems with Weak Feedback


from arxiv, Ph.D. Thesis

This thesis considers sequential decision problems in which the loss or reward incurred by selecting an action may not be inferable from the observed feedback. A major part of the thesis focuses on the unsupervised sequential selection problem, where the loss incurred for selecting an action cannot be inferred from the observed feedback. We also introduce a new setup named Censored Semi Bandits, where the loss incurred for selecting an action can be observed under certain conditions. Finally, we study the channel selection problem in communication networks, where the reward for an action is observed only when no other player selects that action in the same round. These problems find applications in many fields, such as healthcare, crowd-sourcing, security, and adaptive resource allocation, among others. This thesis addresses these sequential decision problems by exploiting the specific structure each problem exhibits. We develop provably optimal algorithms for each of these weak-feedback setups and validate their empirical performance on problem instances derived from synthetic and real datasets.
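As an illustration of the weak feedback described for the channel selection problem, the following is a minimal sketch of a single round in which a player observes a reward only if no other player chose the same channel. It is not the thesis's algorithm. The function name `play_round`, the Bernoulli reward model, and the `mean_rewards` values are assumptions made for this example.

```python
import random
from collections import Counter


def play_round(choices, mean_rewards, rng):
    """Simulate one round of channel selection with collision feedback.

    choices[i] is the channel index picked by player i.
    Returns one entry per player: the observed reward (0 or 1), or None
    when another player picked the same channel and nothing is observed.
    Bernoulli rewards are an assumption of this sketch.
    """
    counts = Counter(choices)
    feedback = []
    for channel in choices:
        if counts[channel] > 1:
            # Collision: the reward for this action is not observed.
            feedback.append(None)
        else:
            # Sole player on this channel: draw and observe a reward.
            reward = 1 if rng.random() < mean_rewards[channel] else 0
            feedback.append(reward)
    return feedback


rng = random.Random(0)
mean_rewards = [0.9, 0.5, 0.2]  # hypothetical per-channel success rates
# Players 0 and 1 collide on channel 0; player 2 is alone on channel 2.
print(play_round([0, 0, 2], mean_rewards, rng))
```

In this sketch, the two colliding players receive `None`, so a learner cannot update its estimate for that channel from the round, which is the kind of missing feedback the abstract refers to.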


