We show that the Subgradient algorithm is universal for online learning on the simplex, in the sense that it simultaneously achieves $O(\sqrt{N})$ regret for adversarial costs and $O(1)$ pseudo-regret for i.i.d. costs. To the best of our knowledge, this is the first demonstration of a universal algorithm on the simplex that is not a variant of Hedge. Since Subgradient is a popular and widely used algorithm, our results have immediate broad application.
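For concreteness, the Subgradient algorithm on the simplex can be sketched as online projected subgradient descent: at each round, take a step against the observed cost vector and project back onto the probability simplex. This is a minimal illustrative sketch, not the paper's analysis; the function names and the fixed step size `eta` are assumptions made here for illustration.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    (standard sort-based algorithm). Illustrative helper, not from the paper."""
    n = len(v)
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    # largest index rho with u[rho] * (rho+1) > css[rho] - 1
    rho = np.max(np.nonzero(u * np.arange(1, n + 1) > (css - 1.0))[0])
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def subgradient_simplex(costs, eta):
    """Online projected subgradient descent on the simplex.

    costs: (T, n) array of linear cost vectors c_1, ..., c_T
    eta:   step size (a fixed eta is an illustrative choice)
    Returns the (T, n) array of plays x_1, ..., x_T.
    """
    T, n = costs.shape
    x = np.full(n, 1.0 / n)                    # start at the uniform distribution
    plays = []
    for t in range(T):
        plays.append(x.copy())
        # the (sub)gradient of the linear loss <c_t, x> is c_t itself
        x = project_simplex(x - eta * costs[t])
    return np.array(plays)

# Usage: run on random costs; every play is a valid point of the simplex.
rng = np.random.default_rng(0)
plays = subgradient_simplex(rng.random((50, 4)), eta=0.1)
print(plays.shape, plays[-1].sum())
```

Each row of `plays` is a probability vector, so its entries are nonnegative and sum to 1 after every projection step.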