We revisit the concept of "adversary" in online learning, motivated by solving robust optimization and adversarial training problems with online learning methods. Although the "adversarial" setting is one of the classical setups in online learning, the concept is often used without sufficient rigor, causing confusion when results and insights from online learning are applied elsewhere. Specifically, there are two fundamentally different types of adversaries, depending on whether the "adversary" is able to anticipate the exogenous randomness of the online learning algorithm. This distinction is particularly relevant to robust optimization and adversarial training because the adversarial sequences arising there are often anticipative, and many online learning algorithms do not achieve diminishing regret against such adversaries. We then apply this distinction to solving robust optimization problems, or (equivalently) adversarial training problems, via online learning and establish a general approach based on imaginary play that covers a broad range of problem classes. Here two players play against each other: the primal player plays the decisions and the dual player plays realizations of the uncertain data. When the game terminates, the primal player has obtained an approximately robust solution. This meta-game allows for solving a large variety of robust optimization and multi-objective optimization problems and generalizes the approach of arXiv:1402.6361.
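A minimal sketch of the primal-dual meta-game described above, under illustrative assumptions not taken from the paper: the loss is f(x, u) = ||x - u||^2, the uncertainty set is U = [-1, 1]^d, the primal player runs online projected gradient descent over decisions, and the dual player best-responds with a worst-case realization of the uncertain data. Because the primal updates here are deterministic, an anticipative dual player gains nothing from observing exogenous randomness, and the averaged primal iterate approaches the robust optimum (x = 0 for this toy loss).

```python
import numpy as np

def robust_play(d=2, T=500, eta=0.1, radius=2.0):
    """Toy primal-dual game: min_x max_{u in [-1,1]^d} ||x - u||^2 (assumed example)."""
    x = np.zeros(d)        # current primal decision
    x_avg = np.zeros(d)    # running average of primal decisions
    for t in range(1, T + 1):
        # Dual player: best response over U = [-1, 1]^d, i.e. push each
        # coordinate of u to the endpoint farther from x.
        u = np.where(x >= 0.0, -1.0, 1.0)
        # Primal player: gradient step on f(., u), then project onto a ball.
        grad = 2.0 * (x - u)
        x = x - (eta / np.sqrt(t)) * grad
        norm = np.linalg.norm(x)
        if norm > radius:
            x *= radius / norm
        # Average of primal iterates yields an approximately robust decision.
        x_avg += (x - x_avg) / t
    return x_avg

if __name__ == "__main__":
    print(robust_play())   # close to the robust optimum x = 0 in this toy setup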