Online optimization is a well-established optimization paradigm that aims to make a sequence of correct decisions given knowledge of the correct answer to previous decision tasks. Bilevel programming involves a hierarchical optimization problem where the feasible region of the so-called outer problem is restricted by the graph of the solution set mapping of the inner problem. This paper brings these two ideas together and studies an online bilevel optimization setting in which a sequence of time-varying bilevel problems are revealed one after the other. We extend the known regret bounds for single-level online algorithms to the bilevel setting. Specifically, we introduce new notions of bilevel regret, develop an online alternating time-averaged gradient method that is capable of leveraging smoothness, and provide regret bounds in terms of the path-length of the inner and outer minimizer sequences.
翻译:在线优化是一个既定的优化模式,目的是根据对先前决策任务正确答案的了解,做出一系列正确的决定。双级编程涉及一个等级优化问题,因为所谓外部问题的可行区域受到内部问题解决方案图图的限制。本文将这两个想法汇集在一起,研究一个在线双级优化环境,在这种环境中,一个接一个地披露时间相移的双级问题。我们把已知单级在线算法的遗憾界限扩大到双级设置。具体地说,我们引入了双级遗憾的新概念,开发了一种在线交替时间平均梯度方法,该方法能够利用内部和外部最小化序列的平滑性,并提供内向和外部最小化序列路径长度的遗憾界限。