Scene graphs are widely used in computer vision as a graphical representation of the relationships between objects in images. However, these applications have not yet reached a practical stage of development because long-tailed predicate distributions bias training. Many recent studies have tackled this problem. In contrast, relatively few works have considered predicate similarity as a distinct dataset feature that also leads to biased prediction. Because of this similarity, infrequent predicates (e.g., parked on, covered in) are easily misclassified as closely related frequent predicates (e.g., on, in). Exploiting predicate similarities, we propose a new classification scheme that branches the process into several fine-grained classifiers for groups of similar predicates. These classifiers aim to capture the detailed differences among similar predicates. We also introduce transfer learning to enhance the features of predicates that lack sufficient training samples to learn descriptive representations. Extensive experiments on the Visual Genome dataset show that combining our method with an existing debiasing approach greatly improves performance on tail predicates in the challenging SGCls/SGDet tasks. Nonetheless, the overall performance of the proposed approach does not reach that of the current state of the art, so further analysis remains necessary as future work.
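The branching idea can be illustrated with a minimal sketch. It is not the paper's implementation: the group names, the group membership, and the rule of multiplying a group probability by a within-group probability are all assumptions made for illustration, and the logits stand in for the outputs of trained classifiers.

```python
import math

# Hypothetical grouping of similar predicates (illustrative only;
# the actual groups used in the paper are not given in the abstract).
PREDICATE_GROUPS = {
    "on_like": ["on", "parked on", "sitting on"],
    "in_like": ["in", "covered in"],
}


def softmax(scores):
    """Numerically stable softmax over a list of logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(group_scores, fine_scores):
    """Combine a coarse group classifier with per-group fine classifiers.

    group_scores: dict mapping group name -> logit
    fine_scores:  dict mapping group name -> list of logits, aligned
                  with PREDICATE_GROUPS[group]
    Returns a dict mapping predicate -> probability.
    Assumption: P(predicate) = P(group) * P(predicate | group).
    """
    names = list(PREDICATE_GROUPS)
    group_probs = softmax([group_scores[g] for g in names])
    probs = {}
    for g, p_group in zip(names, group_probs):
        fine_probs = softmax(fine_scores[g])
        for pred, p_fine in zip(PREDICATE_GROUPS[g], fine_probs):
            probs[pred] = p_group * p_fine
    return probs


def predict(group_scores, fine_scores):
    """Return the predicate with the highest combined probability."""
    probs = classify(group_scores, fine_scores)
    return max(probs, key=probs.get)
```

In this sketch the fine-grained classifier for a group only has to separate predicates that are already similar, so a tail predicate such as "parked on" competes with "on" inside its own group rather than against every predicate at once.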