In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) problems for deep learning, which has important applications in machine learning for improving the robustness of neural networks. In the literature, most methods for solving DRO are based on stochastic primal-dual methods. However, primal-dual methods for deep DRO suffer from several drawbacks: (1) manipulating a high-dimensional dual variable corresponding to the size of the data is computationally expensive; (2) they are not friendly to online learning, where data arrive sequentially. To address these issues, we transform the min-max formulation into a minimization formulation and propose a practical duality-free online stochastic method for solving deep DRO with KL divergence regularization. The proposed online stochastic method resembles, in several respects, the practical stochastic Nesterov's method that is widely used for training deep neural networks. Under a Polyak-Łojasiewicz (PL) condition, we prove that the proposed method enjoys an optimal sample complexity without any requirement on a large batch size. Of independent interest, the proposed method can also be used for solving a family of stochastic compositional problems.
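To make the min-max-to-minimization transformation concrete, the KL-regularized case admits a standard closed-form reduction; the notation below (loss $\ell(\mathbf{w};\mathbf{z}_i)$, regularization parameter $\lambda>0$, and $n$ samples) is illustrative and not taken verbatim from the paper:
\[
\min_{\mathbf{w}} \max_{\mathbf{p}\in\Delta_n} \sum_{i=1}^n p_i\, \ell(\mathbf{w};\mathbf{z}_i) - \lambda \sum_{i=1}^n p_i \log(n p_i)
\;=\;
\min_{\mathbf{w}} \; \lambda \log\!\Big(\frac{1}{n}\sum_{i=1}^n \exp\big(\ell(\mathbf{w};\mathbf{z}_i)/\lambda\big)\Big),
\]
since the inner maximization over the simplex $\Delta_n$ has a softmax solution. The right-hand side is a stochastic compositional problem of the form $\min_{\mathbf{w}} f\big(\mathbb{E}_{\mathbf{z}}[g(\mathbf{w};\mathbf{z})]\big)$ with $f(u)=\lambda\log u$ and $g(\mathbf{w};\mathbf{z})=\exp(\ell(\mathbf{w};\mathbf{z})/\lambda)$, which can be optimized without maintaining the $n$-dimensional dual variable $\mathbf{p}$.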