补充转移土匪下最佳武器身份查验 (Best Arm Identification under Additive Transfer Bandits) - 专知论文

会员服务 ·

0

赌博机/老虎机 · ARM · 可辨认的 · CASE · 迁移学习 ·

2021 年 12 月 8 日

Best Arm Identification under Additive Transfer Bandits

翻译：补充转移土匪下最佳武器身份查验

Ojash Neopane,Aaditya Ramdas,Aarti Singh

We consider a variant of the best arm identification (BAI) problem in multi-armed bandits (MAB) in which there are two sets of arms (source and target), and the objective is to determine the best target arm while only pulling source arms. In this paper, we study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances. We show how our framework covers a range of previously studied pure exploration problems and additionally captures new problems. We propose and theoretically analyze an LUCB-style algorithm to identify an $\epsilon$-optimal target arm with high probability. Our theoretical analysis highlights aspects of this transfer learning problem that do not arise in the typical BAI setup, and yet recover the LUCB algorithm for single domain BAI as a special case.

翻译：我们考虑的是多武装匪徒中最佳武器识别(BAI)问题的一个变式,即有两套武器(来源和目标),目标是确定最佳目标武器,同时只拉出源武器。在本文中,我们研究的是尽管手段不明,但来源和目标武器识别(BAI)案例之间何时存在着已知的叠加关系。我们展示了我们的框架如何涵盖以前研究过的一系列纯勘探问题,并额外捕捉了新的问题。我们提议并理论上分析一种LUCB式算法,以便极有可能确定一个$-epsilon$-最佳目标武器。我们的理论分析强调了在典型BAI设置中并不出现的转让学习问题的各个方面,但作为一个特例,我们又恢复了LUCB用于单一域的BAI算法。

0

相关内容

赌博机/老虎机

赌博机/老虎机

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

资源｜斯坦福课程：深度学习理论！

资源｜斯坦福课程：深度学习理论！

全球人工智能

17+阅读 · 2017年11月9日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Efficient Kernel UCB for Contextual Bandits

Arxiv

0+阅读 · 2022年2月11日

Strong core and Pareto-optimal solutions for the multiple partners matching problem under lexicographic preferences

Arxiv

0+阅读 · 2022年2月11日

Remote Contextual Bandits

Remote Contextual Bandits

Arxiv

0+阅读 · 2022年2月10日

Transfer-Learning Across Datasets with Different Input Dimensions: An Algorithm and Analysis for the Linear Regression Case

Transfer-Learning Across Datasets with Different Input Dimensions: An Algorithm and Analysis for the Linear Regression Case

Arxiv

0+阅读 · 2022年2月10日

Best Arm Identification with a Fixed Budget under a Small Gap

Best Arm Identification with a Fixed Budget under a Small Gap

Arxiv

0+阅读 · 2022年2月10日

Learning in Restless Bandits under Exogenous Global Markov Process

Learning in Restless Bandits under Exogenous Global Markov Process

Arxiv

0+阅读 · 2022年2月10日

Adapting to Mixing Time in Stochastic Optimization with Markovian Data

Arxiv

0+阅读 · 2022年2月9日

A Theory of Label Propagation for Subpopulation Shift

Arxiv

7+阅读 · 2021年2月22日

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Arxiv

5+阅读 · 2020年4月2日

Additive Margin Softmax for Face Verification

Arxiv

11+阅读 · 2018年1月18日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

资源｜斯坦福课程：深度学习理论！

资源｜斯坦福课程：深度学习理论！

全球人工智能

17+阅读 · 2017年11月9日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Efficient Kernel UCB for Contextual Bandits

Arxiv

0+阅读 · 2022年2月11日

Strong core and Pareto-optimal solutions for the multiple partners matching problem under lexicographic preferences

Arxiv

0+阅读 · 2022年2月11日

Remote Contextual Bandits

Remote Contextual Bandits

Arxiv

0+阅读 · 2022年2月10日

Transfer-Learning Across Datasets with Different Input Dimensions: An Algorithm and Analysis for the Linear Regression Case

Transfer-Learning Across Datasets with Different Input Dimensions: An Algorithm and Analysis for the Linear Regression Case

Arxiv

0+阅读 · 2022年2月10日

Best Arm Identification with a Fixed Budget under a Small Gap

Best Arm Identification with a Fixed Budget under a Small Gap

Arxiv

0+阅读 · 2022年2月10日

Learning in Restless Bandits under Exogenous Global Markov Process

Learning in Restless Bandits under Exogenous Global Markov Process

Arxiv

0+阅读 · 2022年2月10日

Adapting to Mixing Time in Stochastic Optimization with Markovian Data

Arxiv

0+阅读 · 2022年2月9日

A Theory of Label Propagation for Subpopulation Shift

Arxiv

7+阅读 · 2021年2月22日

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Arxiv

5+阅读 · 2020年4月2日

Additive Margin Softmax for Face Verification

Arxiv

11+阅读 · 2018年1月18日

微信扫码咨询专知VIP会员