Robust training methods against perturbations to the input data have received great attention in the machine learning literature. A standard approach in this direction is adversarial training, which learns a model using adversarially perturbed training samples. However, adversarial training performs suboptimally against perturbations structured across samples, e.g., universal and group-sparse shifts, which commonly arise in biological data such as gene expression levels across tissues. In this work, we seek to close this optimality gap and introduce Group-Structured Adversarial Training (GSAT), which learns a model robust to perturbations structured across samples. We formulate GSAT as a nonconvex-concave minimax optimization problem that minimizes a group-structured optimal transport cost. Specifically, we focus on applications of GSAT to group-sparse and rank-constrained perturbations modeled using group and nuclear norm penalties. To solve GSAT's non-smooth optimization problem in these cases, we propose a new minimax optimization algorithm called GDADMM, which combines Gradient Descent Ascent (GDA) with the Alternating Direction Method of Multipliers (ADMM). We present several applications of the GSAT framework to gain robustness against structured perturbations on image recognition and computational biology datasets.
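To make the minimax formulation concrete, the following is a minimal, illustrative sketch of proximal gradient descent ascent on a toy least-squares instance. Here the adversary's perturbation matrix `delta` is penalized by a group norm over *feature columns*, so a nonzero group corresponds to a shift applied across all samples (a group-sparse, sample-coupled perturbation). The closed-form group soft-thresholding step stands in for the paper's ADMM subroutine; all function names, the loss, and the step sizes are assumptions for this sketch, not the paper's GDADMM implementation.

```python
import numpy as np

def group_soft_threshold(delta, tau):
    """Proximal operator of tau * sum_j ||delta[:, j]||_2 (group norm),
    with each feature column of `delta` treated as one group shared
    across all samples."""
    norms = np.linalg.norm(delta, axis=0, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * delta

def gsat_proximal_gda(X, y, lam=0.5, lr_w=0.05, lr_d=0.05, steps=300, seed=0):
    """Illustrative proximal GDA for the toy problem
        min_w max_delta  mean(((X + delta) @ w - y)**2) - lam * sum_j ||delta[:, j]||_2.
    The inner maximization is handled by a gradient-ascent step on delta
    followed by the group-norm prox (a simplified stand-in for an ADMM
    inner solver); the outer step is plain gradient descent on w."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = 0.1 * rng.normal(size=d)
    delta = np.zeros_like(X)
    for _ in range(steps):
        r = (X + delta) @ w - y                 # residuals at perturbed inputs
        # ascent on delta (maximize the loss), then prox for the group penalty
        grad_d = (2.0 / n) * np.outer(r, w)
        delta = group_soft_threshold(delta + lr_d * grad_d, lr_d * lam)
        # descent on w against the currently perturbed data
        grad_w = (2.0 / n) * (X + delta).T @ r
        w -= lr_w * grad_w
    return w, delta
```

With a moderately large `lam`, the prox step zeroes out most (often all) perturbation groups, so the learned `w` stays close to the unperturbed least-squares solution; shrinking `lam` lets the adversary activate a few feature-wide shifts.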