Models can fail in unpredictable ways during deployment due to task ambiguity, when multiple behaviors are consistent with the provided training data. An example is an object classifier trained on red squares and blue circles: when encountering blue squares, the intended behavior is undefined. We investigate whether pretrained models are better active learners, capable of disambiguating between the possible tasks a user may be trying to specify. Intriguingly, we find that better active learning is an emergent property of the pretraining process: pretrained models require up to 5 times fewer labels when using uncertainty-based active learning, while non-pretrained models see no benefit or are even hurt. We find these gains come from an ability to select examples with attributes that disambiguate the intended behavior, such as rare product categories or atypical backgrounds. These attributes are far more linearly separable in the representation spaces of pretrained models than in those of non-pretrained models, suggesting a possible mechanism for this behavior.
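The uncertainty-based active learning referred to above can be sketched as entropy-based uncertainty sampling, one common acquisition strategy (the abstract does not specify which acquisition function is used, so this is an illustrative instantiation, not the paper's exact method):

```python
import numpy as np

def uncertainty_sample(probs: np.ndarray, k: int) -> np.ndarray:
    """Select the k most uncertain unlabeled examples to query for labels.

    probs: (n_pool, n_classes) predicted class probabilities over the
    unlabeled pool. Examples with near-uniform predictions (high entropy)
    are exactly the ambiguous cases, e.g. a "blue square" for a model
    trained on red squares and blue circles.
    """
    # Predictive entropy per example; small constant avoids log(0).
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Indices of the k highest-entropy (most uncertain) examples.
    return np.argsort(-entropy)[:k]

# Toy pool: one confident prediction, one ambiguous one.
pool = np.array([[0.98, 0.01, 0.01],
                 [0.34, 0.33, 0.33]])
print(uncertainty_sample(pool, 1))  # selects index 1, the ambiguous example
```

In an active learning loop, the selected indices would be sent to an annotator, their labels added to the training set, and the model retrained before the next round of selection.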