Designing learning systems that are invariant to certain data transformations is critical in machine learning. Practitioners can typically enforce a desired invariance on the trained model through the choice of network architecture, e.g. using convolutions for translations, or through data augmentation. Yet, enforcing true invariance in the network can be difficult, and data invariances are not always known a priori. State-of-the-art methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems, which are complex to solve and often computationally demanding. In this work we investigate new ways of learning invariances only from the training data. Using learnable augmentation layers built directly into the network, our method is very versatile: it can incorporate any type of differentiable augmentation and be applied to a broad class of learning problems beyond computer vision. We provide empirical evidence showing that our approach is easier and faster to train than modern automatic data augmentation techniques based on bilevel optimization, while achieving comparable results. Experiments show that while the invariances transferred to a model through automatic data augmentation are limited by the model's expressivity, the invariance yielded by our approach is insensitive to it by design.
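To make the idea of a learnable augmentation layer concrete, the following is a minimal sketch, not the paper's exact implementation: a single trainable parameter controls the range of random rotations applied inside the network, and because the rotation is realized with differentiable grid sampling, gradients from the task loss flow back into that parameter, so the amount of invariance is learned from the training data alone. The class and parameter names (`LearnableRotation`, `width`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableRotation(nn.Module):
    """Hypothetical augmentation layer with a learnable rotation range."""

    def __init__(self):
        super().__init__()
        # Learnable half-width (in radians) of the rotation range.
        self.width = nn.Parameter(torch.tensor(0.1))

    def forward(self, x):
        n = x.size(0)
        # Sample one angle per image, uniformly in [-width, width].
        # Multiplying the raw sample by `width` keeps the draw
        # differentiable with respect to `width` (reparameterization).
        angles = (2 * torch.rand(n, device=x.device) - 1) * self.width
        cos, sin = torch.cos(angles), torch.sin(angles)
        zeros = torch.zeros_like(angles)
        # Batch of 2x3 affine matrices encoding the rotations.
        theta = torch.stack(
            [torch.stack([cos, -sin, zeros], dim=1),
             torch.stack([sin, cos, zeros], dim=1)], dim=1)  # (n, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        # Differentiable resampling: gradients reach `width` through `grid`.
        return F.grid_sample(x, grid, align_corners=False)
```

At test time, one way to enforce the learned invariance by construction is to average the model's predictions over several augmentations sampled from this layer, which is why the resulting invariance does not depend on the expressivity of the backbone network.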