Machine learning models developed to be invariant under certain data transformations have shown improved generalization in practice. However, a principled understanding of why invariance benefits generalization remains limited. Given a dataset, there is often no principled way to select "suitable" data transformations under which model invariance guarantees better generalization. This paper studies the generalization benefit of model invariance by introducing the sample cover induced by transformations, i.e., a representative subset of a dataset that can approximately recover the whole dataset using transformations. For any set of data transformations, we provide refined generalization bounds for invariant models based on the sample cover. We also characterize the "suitability" of a set of data transformations by the sample covering number induced by transformations, i.e., the smallest size of its induced sample covers. We show that the generalization bounds can be tightened for "suitable" transformations, i.e., those with a small sample covering number. In addition, the proposed sample covering number can be empirically evaluated and thus provides a guide for selecting transformations to develop model invariance for better generalization. In experiments on multiple datasets, we evaluate sample covering numbers for some commonly used transformations and show that a smaller sample covering number for a set of transformations (e.g., the 3D-view transformation) indicates a smaller gap between the test and training error for invariant models, which verifies our propositions.
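To make the sample covering number concrete, the following is a minimal sketch of how it could be estimated empirically, assuming a finite set of transformations and a Euclidean ε-ball covering criterion; the function name, the greedy set-cover heuristic, and the distance-based notion of "approximately recover" are illustrative assumptions here, not the paper's exact construction.

```python
import numpy as np

def sample_covering_number(X, transforms, eps):
    """Greedy estimate of the eps-sample covering number of a dataset X
    under a finite set of transformations (illustrative sketch, not the
    paper's exact algorithm).

    X          : array of shape (n, d) -- the dataset
    transforms : list of callables, each mapping a (d,) sample to a (d,) sample
    eps        : float -- sample x is "covered" by c if some transform of c
                 lies within Euclidean distance eps of x
    Returns the size of the greedily constructed sample cover, which upper
    bounds the true sample covering number.
    """
    n = len(X)
    # covered[i, j] = True if sample j is covered by the orbit of sample i
    covered = np.zeros((n, n), dtype=bool)
    for i, c in enumerate(X):
        # orbit of c under the transformations (identity included)
        orbit = np.stack([t(c) for t in transforms] + [c])
        dists = np.linalg.norm(X[None, :, :] - orbit[:, None, :], axis=-1)
        covered[i] = dists.min(axis=0) <= eps

    remaining = np.ones(n, dtype=bool)   # samples not yet covered
    cover_size = 0
    while remaining.any():
        # pick the sample whose orbit covers the most still-uncovered points
        gains = (covered & remaining).sum(axis=1)
        best = int(gains.argmax())
        remaining &= ~covered[best]
        cover_size += 1
    return cover_size
```

Since every sample covers itself, the greedy loop always makes progress; the standard greedy set-cover guarantee implies the returned size is within a logarithmic factor of the smallest cover under this covering criterion.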