Recently, webly supervised learning (WSL) has been studied to leverage the abundant and easily accessible data on the Internet. Most existing methods focus on learning noise-robust models from web images, while neglecting the performance drop caused by the differences between the web domain and the real-world domain. However, only by tackling this performance gap can we fully exploit the practical value of web datasets. To this end, we propose a Few-shot guided Prototypical (FoPro) representation learning method, which needs only a few labeled examples from reality and can significantly improve performance in the real-world domain. Specifically, we initialize each class center with few-shot real-world data as the ``realistic'' prototype. Then, the intra-class distance between web instances and ``realistic'' prototypes is narrowed by contrastive learning. Finally, we measure the image-prototype distance with a learnable metric. Prototypes are polished by adjacent high-quality web images and in turn help remove distant out-of-distribution samples. In experiments, FoPro is trained on web datasets with the guidance of a few real-world examples and evaluated on real-world datasets. Our method achieves state-of-the-art performance on three fine-grained datasets and two large-scale datasets. Compared with existing WSL methods under the same few-shot settings, FoPro still excels in real-world generalization. Code is available at https://github.com/yuleiqin/fopro.
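The sketch below is a minimal illustration (not the authors' released implementation) of the two core steps the abstract outlines: initializing each class prototype from a few labeled real-world examples, and pulling web instances toward their class prototype with an instance-prototype contrastive objective. The encoder, feature dimension, and temperature are illustrative assumptions; the learnable metric, prototype polishing with high-quality web images, and out-of-distribution removal are omitted.

```python
# Hedged sketch of few-shot guided prototype initialization and
# instance-prototype contrastive alignment. All names and hyperparameters
# here are assumptions for illustration, not FoPro's actual code.
import torch
import torch.nn.functional as F

def init_prototypes(encoder, fewshot_images, fewshot_labels, num_classes):
    """Average the L2-normalized embeddings of few-shot real examples per class."""
    with torch.no_grad():
        feats = F.normalize(encoder(fewshot_images), dim=1)   # (N, D)
    protos = torch.zeros(num_classes, feats.size(1))
    for c in range(num_classes):
        protos[c] = feats[fewshot_labels == c].mean(dim=0)
    return F.normalize(protos, dim=1)                          # (C, D)

def proto_contrastive_loss(encoder, web_images, web_labels, prototypes, tau=0.1):
    """Narrow the distance between web instances and their 'realistic' prototype."""
    feats = F.normalize(encoder(web_images), dim=1)             # (B, D)
    logits = feats @ prototypes.t() / tau                       # (B, C)
    return F.cross_entropy(logits, web_labels)

if __name__ == "__main__":
    torch.manual_seed(0)
    num_classes, dim = 5, 64
    encoder = torch.nn.Sequential(torch.nn.Flatten(),
                                  torch.nn.Linear(3 * 32 * 32, dim))
    # Toy stand-ins for the few-shot real images and noisy web images.
    fs_x = torch.randn(5 * num_classes, 3, 32, 32)
    fs_y = torch.arange(num_classes).repeat(5)
    web_x = torch.randn(32, 3, 32, 32)
    web_y = torch.randint(0, num_classes, (32,))
    protos = init_prototypes(encoder, fs_x, fs_y, num_classes)
    loss = proto_contrastive_loss(encoder, web_x, web_y, protos)
    loss.backward()
    print(f"instance-prototype contrastive loss: {loss.item():.4f}")
```

In the full method, the prototypes would additionally be refined by nearby high-quality web images and used, together with a learnable distance metric, to filter distant out-of-distribution samples.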