We tackle the problem of online optimization with a general, possibly unbounded, loss function. It is well known that when the loss is bounded, the exponentially weighted aggregation strategy (EWA) leads to a regret of order $\sqrt{T}$ after $T$ steps. In this paper, we study a generalized aggregation strategy in which the weights no longer depend exponentially on the losses. Our strategy is based on Follow The Regularized Leader (FTRL): we minimize the expected losses plus a regularizer, which here is a $\phi$-divergence. When the regularizer is the Kullback-Leibler divergence, we recover EWA as a special case. Using alternative divergences makes it possible to handle unbounded losses, at the cost of a worse regret bound in some cases.
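As an illustrative sketch (with assumed notation, not taken verbatim from the paper: $\pi$ a prior distribution, $\eta > 0$ a learning rate, $\ell_s$ the loss incurred at step $s$, and $D_{\phi}$ the $\phi$-divergence used as regularizer), the FTRL update over probability distributions described above can be written as
\[
  \hat{\rho}_{t} \in \operatorname*{arg\,min}_{\rho}
  \left\{ \eta \sum_{s=1}^{t-1} \mathbb{E}_{\theta \sim \rho}\!\left[\ell_s(\theta)\right]
  + D_{\phi}(\rho \,\|\, \pi) \right\}.
\]
When $D_{\phi}$ is the Kullback-Leibler divergence, the minimizer has the closed form of exponential weights,
\[
  \hat{\rho}_{t}(\mathrm{d}\theta) \propto
  \exp\!\left( -\eta \sum_{s=1}^{t-1} \ell_s(\theta) \right) \pi(\mathrm{d}\theta),
\]
which is exactly the EWA special case mentioned above; other choices of $\phi$ yield weights that no longer depend exponentially on the cumulative losses.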