Gradient methods have become mainstream techniques for Bi-Level Optimization (BLO) in learning fields. The validity of existing works heavily relies on either a restrictive Lower-Level Strong Convexity (LLSC) condition, or on solving a series of approximation subproblems with high accuracy, or both. In this work, by averaging the upper- and lower-level objectives, we propose a single-loop Bi-level Averaged Method of Multipliers (sl-BAMM) for BLO that is simple yet efficient for large-scale BLO and avoids the restrictive LLSC condition. We further provide a non-asymptotic convergence analysis of sl-BAMM towards KKT stationary points, and the comparative advantage of our analysis lies in the absence of the strong gradient boundedness assumption, which other works always require. Thus our theory safely captures a wider variety of applications in deep learning, especially those where the upper-level objective is quadratic w.r.t. the lower-level variable. Experimental results demonstrate the superiority of our method.
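As a rough illustration of the "averaging" idea on a toy quadratic problem, the sketch below runs a generic single-loop scheme that combines the upper-level objective F and the lower-level objective f with a decaying weight c_k, tracks the lower-level minimizer with an auxiliary variable z, and maintains a multiplier on the lower-level optimality gap. The objectives, step sizes, the schedule for c_k, and the multiplier rule are all hypothetical choices for illustration only; the actual sl-BAMM updates and their convergence guarantees are those given in the paper.

```python
# Minimal single-loop sketch of an averaged-objective, multiplier-style scheme
# on a toy bilevel problem. This is NOT the authors' sl-BAMM algorithm; all
# update formulas and constants below are assumed for illustration.
import numpy as np

# Toy BLO: min_x F(x, y*(x))  s.t.  y*(x) = argmin_y f(x, y)
def F(x, y):  return 0.5 * (y - 1.0) ** 2 + 0.5 * x ** 2   # upper-level objective
def f(x, y):  return 0.5 * (y - x) ** 2                    # lower-level objective

def dF(x, y): return np.array([x, y - 1.0])                # (dF/dx, dF/dy)
def df(x, y): return np.array([-(y - x), y - x])           # (df/dx, df/dy)

x, y, z, lam = 0.0, 0.0, 0.0, 0.0   # z tracks the lower-level minimizer
alpha, beta = 0.1, 0.1              # step sizes (assumed)

for k in range(1, 2001):
    c_k = 1.0 / np.sqrt(k)          # decaying weight on the upper-level objective (assumed schedule)
    # z step: one gradient step on the lower-level problem at the current x
    z = z - beta * df(x, z)[1]
    # y step: one gradient step on the averaged objective c_k*F + f,
    # with the multiplier acting on the lower-level term
    y = y - alpha * (c_k * dF(x, y)[1] + (1.0 + lam) * df(x, y)[1])
    # x step: one gradient step on the averaged objective plus the constraint term
    x = x - alpha * (c_k * dF(x, y)[0] + lam * (df(x, y)[0] - df(x, z)[0]))
    # multiplier step on the value-function gap f(x, y) - f(x, z) <= 0
    lam = max(0.0, lam + beta * (f(x, y) - f(x, z)))

# For reference, the analytic bilevel solution of this toy problem is x = y = 0.5.
print("x =", x, " y =", y, " lambda =", lam)
```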