In recent years, gradient-based methods for solving bi-level optimization problems have drawn considerable interest from the machine learning community. However, to compute the gradient of the best response, existing work almost always requires the lower-level solution set to be a singleton (the Lower-Level Singleton, or LLS, assumption). In this work, by formulating bi-level models from an optimistic bi-level viewpoint, we establish a novel Bi-level Descent Aggregation (BDA) framework, which aggregates the hierarchical objectives of both the upper and lower levels. The flexibility of the framework comes from its embedded, replaceable, task-tailored iteration-dynamics modules, allowing it to capture a wide range of bi-level learning tasks. Theoretically, we derive a new methodology to prove the convergence of the BDA framework without the LLS restriction. Moreover, the proposed proof recipe also improves the convergence results of conventional gradient-based bi-level methods under the LLS simplification. Furthermore, we employ a one-stage technique to numerically accelerate the back-propagation computation. Extensive experiments corroborate our theoretical results and demonstrate the superiority of the proposed algorithm on hyper-parameter optimization and meta-learning tasks.
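To make the aggregation idea concrete, the following is a minimal sketch of a BDA-style inner update. All names here (bda_inner_updates, grad_F_y, grad_f_y) and the mixing schedule alpha_k are illustrative assumptions, not the paper's exact rule or step sizes: the lower-level iterate descends a convex combination of the upper-level and lower-level gradient directions.

```python
import numpy as np

def bda_inner_updates(y0, x, grad_F_y, grad_f_y, K=50, s_u=0.1, s_l=0.1):
    """Run K aggregated descent steps on the lower-level variable y.

    grad_F_y(x, y): gradient of the upper-level objective F w.r.t. y
    grad_f_y(x, y): gradient of the lower-level objective f w.r.t. y
    alpha_k mixes the two directions; this decaying schedule is illustrative.
    """
    y = y0
    for k in range(K):
        alpha_k = 1.0 / (k + 2)  # hypothetical schedule, decaying toward 0
        # Aggregated descent direction over both hierarchical objectives.
        d = alpha_k * s_u * grad_F_y(x, y) + (1 - alpha_k) * s_l * grad_f_y(x, y)
        y = y - d
    return y

# Toy check that the update runs: f(x, y) = 0.5*(y - x)^2 (singleton argmin),
# F(x, y) = 0.5*(y - 2)^2; the framework itself targets the non-singleton case.
y_star = bda_inner_updates(
    y0=np.array([0.0]), x=np.array([1.0]),
    grad_F_y=lambda x, y: y - 2.0,
    grad_f_y=lambda x, y: y - x,
)
print(y_star)
```

The intuition behind a decaying alpha_k in this sketch is that lower-level stationarity dominates asymptotically, while the upper-level gradient steers the iterates early on; when the lower-level problem has multiple minimizers, that upper-level term is what breaks ties, which is precisely the setting the LLS assumption rules out.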