Budgeted and Non-budgeted Causal Bandits - Zhuanzhi (专知) Papers

December 13, 2020

Budgeted and Non-budgeted Causal Bandits

Vineet Nair,Vishakha Patil,Gaurav Sinha

Learning good interventions in a causal graph can be modelled as a stochastic multi-armed bandit problem with side-information. First, we study this problem when interventions are more expensive than observations and a budget is specified. If there are no backdoor paths from an intervenable node to the reward node, then we propose an algorithm to minimize simple regret that optimally trades off observations and interventions based on the cost of intervention. We also propose an algorithm that accounts for the cost of interventions, utilizes causal side-information, and minimizes the expected cumulative regret without exceeding the budget. Our cumulative-regret minimization algorithm performs better than standard algorithms that do not take side-information into account. Finally, we study the problem of learning the best interventions without a budget constraint in general graphs and give an algorithm that achieves constant expected cumulative regret in terms of the instance parameters when the parent distribution of the reward variable for each intervention is known. Our results are experimentally validated and compared to the best-known bounds in the current literature.
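
To make the budgeted setting concrete, the following is a minimal sketch of the kind of standard baseline the abstract compares against: plain UCB1 that ignores causal side-information and stops once the budget cannot pay for the next pull. It is not the authors' algorithm. The function name `budgeted_ucb`, the Bernoulli rewards, the arm means, and the cost values (arm 0 as a cheap "observe" arm, the others as costlier interventions) are all assumptions made for illustration.

```python
import math
import random


def budgeted_ucb(arm_means, arm_costs, budget, seed=0):
    """Run UCB1 on Bernoulli arms until the next pull would exceed the budget.

    Illustrative baseline only: it uses no causal side-information.
    arm_means and arm_costs are hypothetical inputs.
    Returns (pull_counts, total_spent, cumulative_pseudo_regret).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of pulls per arm
    sums = [0.0] * n_arms      # total reward collected per arm
    spent = 0.0
    regret = 0.0
    best_mean = max(arm_means)
    t = 0
    while True:
        t += 1
        if t <= n_arms:
            # Initialisation: pull each arm once.
            arm = t - 1
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        if spent + arm_costs[arm] > budget:
            break  # the chosen pull is not affordable; stop
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        spent += arm_costs[arm]
        # Pseudo-regret against the best mean, ignoring cost differences.
        regret += best_mean - arm_means[arm]
    return counts, spent, regret


if __name__ == "__main__":
    # Arm 0 is a cheap "observe" arm; arms 1 and 2 are costlier interventions.
    counts, spent, regret = budgeted_ucb(
        arm_means=[0.5, 0.4, 0.7], arm_costs=[1.0, 3.0, 3.0], budget=300.0
    )
    print(counts, spent, regret)
```

Because this baseline treats every arm independently, it cannot use an observation to learn about several interventions at once, which is the advantage the abstract attributes to exploiting causal side-information.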
