In Self-Supervised Learning (SSL), it is known that frequent collisions, in which target data and its negative samples share the same class, can degrade performance. Collisions are especially likely in real-world data such as crawled data or robot-gathered observations, where duplicates are common. To address this problem, we claim that sampling negatives from an adaptively debiased distribution maintained in memory makes the model more stable than sampling directly from a biased dataset. In this paper, we introduce a novel SSL framework with adaptive Duplicate Elimination (DUEL), inspired by human working memory. The proposed framework successfully prevents degradation of downstream task performance caused by dramatic inter-class imbalance.
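The collision problem above can be illustrated with a minimal toy simulation (not the DUEL method itself; the class labels, pool sizes, and sampling routine here are hypothetical): when negatives are drawn directly from a dataset dominated by duplicates of one class, most negatives for a target of that class collide with it, whereas a duplicate-free memory keeps the collision rate low.

```python
import random

random.seed(0)

# Hypothetical crawled dataset: one class is heavily duplicated,
# so negatives drawn from it often share the target's class.
biased_dataset = ["dog"] * 90 + ["cat"] * 5 + ["bird"] * 5
# Duplicate-free memory: one representative slot per class.
dedup_memory = sorted(set(biased_dataset))

def collision_rate(pool, target_class, num_trials=10000, negatives_per_trial=8):
    """Fraction of sampled negatives that share the target's class."""
    hits, total = 0, 0
    for _ in range(num_trials):
        for neg in random.choices(pool, k=negatives_per_trial):
            hits += (neg == target_class)
            total += 1
    return hits / total

rate_biased = collision_rate(biased_dataset, "dog")  # near 0.9 (90% duplicates)
rate_dedup = collision_rate(dedup_memory, "dog")     # near 1/3 (uniform over classes)
print(f"biased pool collision rate:    {rate_biased:.2f}")
print(f"deduplicated memory collision: {rate_dedup:.2f}")
```

This sketch only motivates why sampling negatives from a debiased memory helps; the actual framework performs duplicate elimination adaptively during training rather than by exact label deduplication.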