While deep neural networks fit the training distribution well, improving their generalization to the test distribution and their robustness to input perturbations remains a challenge. Although a number of mixup-based augmentation strategies have been proposed to partially address these issues, it remains unclear how to best utilize the supervisory signal within each input for mixup from an optimization perspective. We propose a new perspective on batch mixup and formulate the optimal construction of a batch of mixup data as maximizing the saliency measure of each individual mixup example while encouraging supermodular diversity among the constructed examples. This leads to a novel discrete optimization problem that minimizes the difference between submodular functions. We also propose an efficient iterative submodular minimization algorithm based on a modular approximation, so that the mixup can be computed per minibatch and is suitable for minibatch-based neural network training. Our experiments show that the proposed method achieves state-of-the-art generalization, calibration, and weakly supervised localization results compared to other mixup methods. The source code is available at https://github.com/snu-mllab/Co-Mixup.
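To make the two ingredients named above more concrete, the following is a minimal, simplified sketch of saliency-guided batch mixup in PyTorch. It is not the paper's difference-of-submodular optimization; it only illustrates (i) computing a per-example saliency measure from input gradients and (ii) mixing a minibatch so that salient regions of each source are preserved. The function names and the specific saliency-weighted mixing rule are illustrative assumptions, not the released implementation.

```python
# Hypothetical, simplified sketch of saliency-guided batch mixup.
# NOT the paper's actual algorithm; for illustration only.
import torch
import torch.nn.functional as F

def input_saliency(model, x, y):
    """Per-pixel saliency via the gradient of the loss w.r.t. the input."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return grad.abs().sum(dim=1, keepdim=True)  # shape (B, 1, H, W)

def saliency_weighted_mixup(model, x, y, num_classes):
    """Mix each example with a shuffled partner, weighting by relative saliency."""
    sal = input_saliency(model, x, y)
    perm = torch.randperm(x.size(0), device=x.device)
    sal_a, sal_b = sal, sal[perm]
    # Per-pixel mixing ratio favoring the more salient source at each location.
    w = sal_a / (sal_a + sal_b + 1e-8)
    x_mix = w * x + (1 - w) * x[perm]
    # Label weights follow the average pixel contribution of each source.
    lam = w.mean(dim=(1, 2, 3))
    y_onehot = F.one_hot(y, num_classes).float()
    y_mix = lam[:, None] * y_onehot + (1 - lam)[:, None] * y_onehot[perm]
    return x_mix, y_mix
```

A training loop would call `saliency_weighted_mixup` on each minibatch and train on `(x_mix, y_mix)` with a soft-label cross-entropy; the full method additionally enforces diversity across the constructed batch via the supermodular term described above.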