Deep learning models tend not to be out-of-distribution (OOD) robust, primarily because they rely on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactually invariant to spurious features, a requirement for OOD robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactual invariance if the augmentation is performed by a context-guessing machine, an abstract machine that guesses the most likely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a context-guessing machine does not lead to robust OOD classifiers.
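To make the setting concrete, the following is a minimal toy sketch (not taken from the paper) of counterfactual data augmentation driven by a context-guessing machine: a spurious context token co-occurs with the label, and the augmentation replaces the context that the machine guesses to be most likely for a given input. All names here (CONTEXTS, guess_context, augment) are hypothetical and chosen purely for illustration.

```python
import random

CONTEXTS = ["<books>", "<movies>"]      # spurious context markers
LABELS = {"great": 1, "terrible": 0}    # causal sentiment words

def make_example(label_word, context):
    return f"{context} this product is {label_word}", LABELS[label_word]

def guess_context(text):
    # A "context-guessing machine": returns the single most likely context
    # for the input, rather than considering all plausible contexts.
    return "<books>" if "<books>" in text else "<movies>"

def augment(text, label):
    # Counterfactual augmentation driven by the guessed context:
    # swap the guessed context for a different one, keeping the label fixed.
    guessed = guess_context(text)
    alternative = random.choice([c for c in CONTEXTS if c != guessed])
    return text.replace(guessed, alternative), label

# Usage: build a biased training set, then add counterfactual copies of each example.
data = [make_example("great", "<books>"), make_example("terrible", "<movies>")]
augmented = data + [augment(t, y) for t, y in data]
for t, y in augmented:
    print(y, t)
```

Because the augmentation only ever edits the single context the machine guesses to be most likely, inputs whose true context is ambiguous or mis-guessed are not counterfactually paired, which is the kind of gap the abstract's invariance argument is about.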