Despite -- or maybe because of -- their astonishing capacity to fit data, neural networks are believed to have difficulty extrapolating beyond the training data distribution. This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity. Rather, the shortcoming is inherited from a learning hypothesis: even with infinitely many training examples, the outcomes of examples not explicitly observed remain underspecified in the learner's model. To endow neural networks with the ability to extrapolate over group transformations, we introduce a learning framework counterfactually guided by the alternative hypothesis that invariance to (known) transformation groups is mandatory even without supporting evidence, unless the learner deems it inconsistent with the training data. Unlike existing invariance-driven methods for (counterfactual) extrapolation, this framework allows extrapolation from a single environment. Finally, we introduce sequence and image extrapolation tasks that validate our framework and showcase the shortcomings of traditional approaches.