We are interested in learning robust models from insufficient data, without the need for any externally pre-trained checkpoints. First, compared with sufficient data, we show why insufficient data renders the model more easily biased toward the limited training environments, which usually differ from those at test time. For example, if all the training swan samples are "white", the model may wrongly use the "white" environment to represent the intrinsic class "swan". Then, we justify that the equivariance inductive bias can retain the class feature while the invariance inductive bias can remove the environmental feature, leaving only the class feature that generalizes to any environmental change in testing. To impose the two biases on learning, for equivariance, we demonstrate that any off-the-shelf contrastive-based self-supervised feature learning method can be deployed; for invariance, we propose a class-wise invariant risk minimization (IRM) that efficiently tackles the challenge of missing environmental annotations in conventional IRM. State-of-the-art experimental results on real-world benchmarks (VIPriors, ImageNet100 and NICO) validate the great potential of equivariance and invariance in data-efficient learning. The code is available at https://github.com/Wangt-CN/EqInv.
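For context, the conventional IRM objective that the abstract refers to is the IRMv1 formulation of Arjovsky et al. (2019); a minimal sketch in their notation is

\[
\min_{\Phi} \; \sum_{e \in \mathcal{E}_{tr}} \Big[ R^{e}(\Phi) \;+\; \lambda \, \big\| \nabla_{w \mid w = 1.0} \, R^{e}(w \cdot \Phi) \big\|^{2} \Big],
\]

where $R^{e}$ is the empirical risk in training environment $e$, $\Phi$ is the learned representation (evaluated with a fixed dummy classifier $w$), and $\lambda$ weights the per-environment gradient penalty that enforces invariance. Note that the sum requires the environment partition $\mathcal{E}_{tr}$ to be annotated; the class-wise IRM proposed here is meant to remove that annotation requirement, and its exact formulation is not given in this abstract.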