Recently, Miller et al. showed that a model's in-distribution (ID) accuracy has a strong linear correlation with its out-of-distribution (OOD) accuracy on several OOD benchmarks, a phenomenon they dubbed "accuracy-on-the-line". While this is a useful tool for model selection (i.e., the model most likely to perform best OOD is the one with the highest ID accuracy), it does not help estimate the actual OOD performance of models without access to a labeled OOD validation set. In this paper, we show that a similar but surprising phenomenon also holds for the agreement between pairs of neural network classifiers: whenever accuracy-on-the-line holds, we observe that the OOD agreement between the predictions of any pair of neural networks (with potentially different architectures) also has a strong linear correlation with their ID agreement. Furthermore, we observe that the slope and bias of OOD vs. ID agreement closely match those of OOD vs. ID accuracy. This phenomenon, which we call agreement-on-the-line, has important practical applications: without any labeled data, we can predict the OOD accuracy of classifiers, since OOD agreement can be estimated with just unlabeled data. Our prediction algorithm outperforms previous methods both in shifts where agreement-on-the-line holds and, surprisingly, when accuracy is not on the line. This phenomenon also provides new insights into deep neural networks: unlike accuracy-on-the-line, agreement-on-the-line appears to hold only for neural network classifiers.
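The idea of predicting OOD accuracy from agreement can be illustrated with a minimal sketch. It is not the paper's prediction algorithm, only an illustration under stated assumptions: the line is fit by ordinary least squares on raw agreement rates (the original work may transform the axes before fitting), and the function names `agreement`, `fit_agreement_line`, and `predict_ood_accuracy` are hypothetical. The sketch fits slope and bias on (ID agreement, OOD agreement) pairs, which need no labels, and then reuses that line to map a model's ID accuracy to an estimated OOD accuracy.

```python
import numpy as np


def agreement(preds_a, preds_b):
    """Fraction of inputs on which two classifiers predict the same label.

    Needs only predictions, so it can be computed on unlabeled data.
    """
    return float(np.mean(np.asarray(preds_a) == np.asarray(preds_b)))


def fit_agreement_line(id_agreements, ood_agreements):
    """Least-squares fit of: OOD agreement = slope * ID agreement + bias.

    Assumption: a plain linear fit on raw rates, one point per model pair.
    """
    slope, bias = np.polyfit(id_agreements, ood_agreements, deg=1)
    return float(slope), float(bias)


def predict_ood_accuracy(id_accuracy, slope, bias):
    """Reuse the agreement line for accuracy.

    This relies on the observation that the slope and bias of the
    agreement line closely match those of the accuracy line.
    """
    return float(np.clip(slope * id_accuracy + bias, 0.0, 1.0))


if __name__ == "__main__":
    # Made-up agreement rates for four hypothetical model pairs.
    id_agr = [0.60, 0.70, 0.80, 0.90]
    ood_agr = [0.38, 0.46, 0.54, 0.62]
    slope, bias = fit_agreement_line(id_agr, ood_agr)
    print(predict_ood_accuracy(0.95, slope, bias))
```

The estimate is only as reliable as the linear fit, so in practice one would check how well the agreement points lie on a line before trusting the predicted accuracy.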