Despite the increasing scale of datasets in machine learning, generalization to unseen regions of the data distribution remains crucial. Such extrapolation is by definition underdetermined and is dictated by a learner's inductive biases. Machine learning systems often do not share the same inductive biases as humans and, as a result, extrapolate in ways inconsistent with our expectations. We investigate two such distinct inductive biases: feature-level bias (differences in which features are more readily learned) and exemplar-vs-rule bias (differences in how those learned features are used for generalization). Exemplar- vs. rule-based generalization has been studied extensively in cognitive psychology, and in this work we present a protocol inspired by these experimental approaches for directly probing this trade-off in learning systems. The measures we propose characterize changes in extrapolation behavior when feature coverage is manipulated in a combinatorial setting. We present empirical results across a range of models and across both expository and real-world image and language domains. We demonstrate that measuring the exemplar-rule trade-off while controlling for feature-level bias provides a more complete picture of extrapolation behavior than existing formalisms. We find that most standard neural network models have a propensity toward exemplar-based extrapolation, and we discuss the implications of these findings for research on data augmentation, fairness, and systematic generalization.
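To make the feature-coverage manipulation concrete, the following is a minimal toy sketch in Python, not the paper's actual protocol or measures: the encoding, the coverage split, and all names (`encode`, `K`) are illustrative assumptions. Feature A perfectly predicts the label (the "rule"); feature B is irrelevant, but one class is trained with full coverage of B's values and the other with a single value. On the held-out combination, a rule-following predictor and a nearest-exemplar predictor pull apart.

```python
# Toy probe of the exemplar-vs-rule trade-off via feature coverage.
# All names and the encoding are illustrative, not from the paper.
import numpy as np

K = 5  # number of values of the irrelevant feature B (assumed)

def encode(a, b):
    """Concatenate the predictive bit A with a one-hot code for B."""
    vec = np.zeros(1 + K)
    vec[0], vec[1 + b] = a, 1.0
    return vec

# Training set: class 0 seen with B in {0..K-1} (full coverage),
# class 1 seen only with B = 0 (partial coverage).
X_train = np.stack([encode(0, b) for b in range(K)] + [encode(1, 0)])
y_train = np.array([0] * K + [1])

# Probe item: the unseen combination (A=1, B=3).
x_test = encode(1, 3)

# Rule-based extrapolation: follow the predictive feature A alone.
rule_pred = int(x_test[0])

# Exemplar-based extrapolation: 1-nearest neighbour over the full input,
# so similarity to training exemplars (here via the B value) can
# override the rule.
dists = np.abs(X_train - x_test).sum(axis=1)
exemplar_pred = int(y_train[np.argmin(dists)])

print(f"rule-based: {rule_pred}, exemplar-based: {exemplar_pred}")
# rule-based: 1, exemplar-based: 0 -- the two biases diverge exactly on
# the feature combinations whose coverage was manipulated.
```

A learner's predictions on such held-out combinations, compared across coverage conditions, indicate where it sits on the exemplar-rule spectrum; in the toy above, an exemplar-based learner's answer shifts with coverage while a rule-based learner's does not.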