In this note, we give a new lower bound for the $\gamma$-regret in bandit problems, that is, the regret incurred when comparing against a benchmark that is $\gamma$ times the optimal solution: $\mathsf{Reg}_{\gamma}(T) = \sum_{t = 1}^T \bigl(\gamma \max_{\pi} f(\pi) - f(\pi_t)\bigr)$. The $\gamma$-regret arises in structured bandit problems where finding an exact optimum of $f$ is intractable. Our lower bound is stated in terms of a modification of the constrained Decision-Estimation Coefficient (DEC) of~\citet{foster2023tight} (closely related to the original offset DEC of~\citet{foster2021statistical}), which we term the $\gamma$-DEC. When restricted to the traditional regret setting where $\gamma = 1$, our result removes the logarithmic factors in the lower bound of~\citet{foster2023tight}.