The superior performance of some of today's state-of-the-art deep learning models is owed in part to extensive (self-)supervised contrastive pretraining on large-scale datasets. In contrastive learning, the network is presented with pairs of positive (similar) and negative (dissimilar) datapoints and is trained to find an embedding vector for each datapoint, i.e., a representation, which can be further fine-tuned for various downstream tasks. To safely deploy these models in critical decision-making systems, it is crucial to equip them with a measure of their uncertainty or reliability. However, due to the pairwise nature of training a contrastive model, and the lack of absolute labels on the output (an abstract embedding vector), adapting conventional uncertainty estimation techniques to such models is non-trivial. In this work, we study whether the uncertainty of such a representation can be quantified for a single datapoint in a meaningful way. In other words, we explore whether the downstream performance on a given datapoint is predictable directly from its pre-trained embedding. We show that this goal can be achieved by directly estimating the distribution of the training data in the embedding space and accounting for the local consistency of the representations. Our experiments show that this notion of uncertainty for an embedding vector often strongly correlates with its downstream accuracy.
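To make the idea concrete, the following is a minimal sketch (not the paper's implementation) of how one might score a pre-trained embedding by combining the two ingredients named above: an estimate of the training-data density in embedding space and a local-consistency score. The function name, the use of kernel density estimation, and the cosine-similarity consistency measure are all illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KernelDensity, NearestNeighbors

def embedding_uncertainty(train_emb, query_emb, bandwidth=0.5, k=10):
    """Score query embeddings by (a) estimated training density and
    (b) local consistency with nearest training neighbors.
    Both arrays have shape (n_points, embedding_dim)."""
    # (a) Estimate the training-data distribution in embedding space.
    # KDE is one choice; the paper's exact estimator may differ.
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(train_emb)
    log_density = kde.score_samples(query_emb)  # higher = more familiar region

    # (b) Local consistency: mean cosine similarity between each query
    # and its k nearest training embeddings; low values suggest the
    # representation sits in an unstable neighborhood.
    nn = NearestNeighbors(n_neighbors=k).fit(train_emb)
    _, idx = nn.kneighbors(query_emb)
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    t = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    consistency = np.mean(np.einsum("qd,qkd->qk", q, t[idx]), axis=1)

    # Low density and low consistency both signal high uncertainty,
    # i.e., downstream performance on this point is likely to be worse.
    return log_density, consistency
```

Under this sketch, a practitioner would flag datapoints whose log-density and consistency scores fall in the low tail of the training distribution as unreliable, without ever consulting downstream labels.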