Upper Confidence Bound Papers - Zhuanzhi

Upper Confidence Bound

A Frequency-Domain Analysis of the Multi-Armed Bandit Problem: A New Perspective on the Exploration-Exploitation Trade-off

Arxiv

0+ reads · October 10

Q-Learning with Shift-Aware Upper Confidence Bound in Non-Stationary Reinforcement Learning

Arxiv

0+ reads · October 3

Regret Analysis for Randomized Gaussian Process Upper Confidence Bound

Arxiv

0+ reads · July 16

Bayesian Optimization of Robustness Measures under Input Uncertainty: A Randomized Gaussian Process Upper Confidence Bound Approach

Arxiv

0+ reads · July 23

Bayesian Optimization of Robustness Measures Using Randomized GP-UCB-based Algorithms under Input Uncertainty

Arxiv

0+ reads · April 4

Neural Contextual Bandits Under Delayed Feedback Constraints

Arxiv

0+ reads · April 16

Mastering truss structure optimization with tree search

Arxiv

0+ reads · April 2

On Pareto Optimality for the Multinomial Logistic Bandit

Arxiv

0+ reads · January 31

Precise Asymptotics and Refined Regret of Variance-Aware UCB

Arxiv

0+ reads · February 16

Bayesian Optimization by Kernel Regression and Density-based Exploration

Arxiv

0+ reads · February 10

Bandits with Anytime Knapsacks

Arxiv

0+ reads · January 30

Zero-Inflated Bandits

Arxiv

0+ reads · January 31

On the Precise Asymptotics and Refined Regret of the Variance-Aware UCB Algorithm

Arxiv

1+ reads · December 12, 2024

Mastering truss structure optimization with tree search

Arxiv

0+ reads · November 5, 2024

Tree Ensembles for Contextual Bandits

Arxiv

0+ reads · November 1, 2024
