Consider an online convex optimization problem where the loss functions are self-concordant barriers, smooth relative to a convex function $h$, and possibly non-Lipschitz. We analyze the regret of online mirror descent with $h$. Based on this result, we prove the following in a unified manner. Denote by $T$ the time horizon and by $d$ the parameter dimension. 1. For online portfolio selection, the regret of $\widetilde{\text{EG}}$, a variant of exponentiated gradient due to Helmbold et al., is $\tilde{O}(T^{2/3} d^{1/3})$ when $T > 4 d / \log d$. This improves on the original $\tilde{O}(T^{3/4} d^{1/2})$ regret bound for $\widetilde{\text{EG}}$. 2. For online portfolio selection, the regret of online mirror descent with the logarithmic barrier is $\tilde{O}(\sqrt{T d})$. The bound matches that of Soft-Bayes due to Orseau et al. up to logarithmic factors. 3. For online learning of quantum states with the logarithmic loss, the regret of online mirror descent with the log-determinant function is also $\tilde{O}(\sqrt{T d})$. Its per-iteration time is shorter than that of any existing algorithm we know of.
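To make the portfolio-selection setting concrete, here is a minimal sketch of the $\widetilde{\text{EG}}$-style update: exponentiated gradient on the log loss $\ell_t(x) = -\log\langle x, r_t\rangle$, mixed with the uniform portfolio to stay away from the simplex boundary. The step size `eta` and mixing weight `alpha` are illustrative placeholders, not the tuned values from the analysis above.

```python
import numpy as np

def eg_tilde(returns, eta=0.05, alpha=0.01):
    """Illustrative EG-tilde-style update for online portfolio selection.

    `returns` is a (T, d) array of positive price relatives r_t.
    Each round plays portfolio x on the simplex, suffers the log loss
    -log <x, r_t>, then applies the exponentiated-gradient update
    (mirror descent with negative entropy) mixed with uniform.
    eta and alpha are hypothetical tuning parameters.
    """
    T, d = returns.shape
    x = np.full(d, 1.0 / d)            # start at the uniform portfolio
    cumulative_log_wealth = 0.0
    for t in range(T):
        r = returns[t]
        cumulative_log_wealth += np.log(x @ r)
        grad = -r / (x @ r)            # gradient of the log loss at x
        x = x * np.exp(-eta * grad)    # multiplicative (EG) step
        x /= x.sum()                   # normalize back onto the simplex
        x = (1 - alpha) * x + alpha / d  # mix with uniform portfolio
    return x, cumulative_log_wealth
```

The mixing step is what distinguishes $\widetilde{\text{EG}}$ from plain exponentiated gradient: it bounds the gradients, at the cost of the extra regret terms analyzed above.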