In this paper, we propose a practical online method for solving a distributionally robust optimization (DRO) problem for deep learning, which has important applications in machine learning for improving the robustness of neural networks. In the literature, most methods for solving DRO are based on stochastic primal-dual methods. However, primal-dual methods for deep DRO suffer from several drawbacks: (1) manipulating a high-dimensional dual variable whose dimension scales with the size of the data is computationally expensive; (2) they are not friendly to online learning, where data arrive sequentially. To address these issues, we transform the min-max formulation into a minimization formulation and propose a practical duality-free online stochastic method for solving deep DRO with KL-divergence regularization. The proposed online stochastic method resembles, in several respects, the practical stochastic Nesterov's method that is widely used for learning deep neural networks. Under a Polyak-Lojasiewicz (PL) condition, we prove that the proposed method enjoys an optimal sample complexity and, with a moderate mini-batch size, a better round complexity (the number of gradient evaluations divided by a fixed mini-batch size) than existing algorithms for solving the min-max or min formulation of DRO. Of independent interest, the proposed method can also be used for solving a family of stochastic compositional problems.
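To make the duality-free transformation concrete, the following standard identity (stated here in our own notation, with $\ell$ the per-example loss, $\lambda$ the KL regularization weight, and $\Delta_n$ the probability simplex) turns the KL-regularized min-max problem into an equivalent minimization:

$$
\min_{\mathbf{w}} \max_{\mathbf{p}\in\Delta_n} \sum_{i=1}^n p_i\, \ell(\mathbf{w}; \mathbf{z}_i) - \lambda\, \mathrm{KL}\!\left(\mathbf{p}, \tfrac{\mathbf{1}}{n}\right)
\;=\;
\min_{\mathbf{w}} \ \lambda \log\!\Big(\frac{1}{n}\sum_{i=1}^n \exp\big(\ell(\mathbf{w}; \mathbf{z}_i)/\lambda\big)\Big),
$$

which is a stochastic compositional problem of the form $f(\mathbb{E}_{\mathbf{z}}[g(\mathbf{w};\mathbf{z})])$ with $f(u)=\lambda\log u$ and $g(\mathbf{w};\mathbf{z})=\exp(\ell(\mathbf{w};\mathbf{z})/\lambda)$, so no $n$-dimensional dual variable needs to be maintained.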
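As a rough illustration of how such a duality-free online update can look in practice, the sketch below (PyTorch-style; the toy linear model, the moving-average rate `beta`, and all variable names are our own illustrative assumptions, not the paper's exact algorithm) tracks the inner expectation with a running average and couples it with a momentum step, so only a single scalar estimate, rather than a per-example dual variable, is stored:

```python
import torch

# Illustrative sketch of a duality-free online update for KL-regularized DRO.
# The inner expectation u ~ E_z[exp(loss(w; z)/lam)] is tracked by a moving
# average; mini-batches arrive sequentially as in the online setting.

torch.manual_seed(0)
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # momentum-style step
lam, beta = 1.0, 0.1          # KL regularization weight, moving-average rate
u = torch.tensor(1.0)         # running estimate of E[exp(loss/lam)]

for step in range(100):
    # a fresh mini-batch arrives online
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    losses = (model(x) - y).pow(2).mean(dim=1)        # per-example losses
    s = torch.exp(losses / lam)                       # exp-transformed losses
    u = (1 - beta) * u + beta * s.mean().detach()     # update inner estimate
    # surrogate whose gradient approximates that of lam*log(E[exp(loss/lam)]),
    # using the tracked estimate u in the denominator
    robust_loss = lam * (s.mean() / u)
    opt.zero_grad()
    robust_loss.backward()
    opt.step()
```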