Scene graph generation (SGG) has made tremendous progress in recent years. However, the intrinsic long-tailed distribution of predicate classes remains a challenging problem. Almost all existing SGG methods follow the same framework: a similar backbone network for object detection and a customized network for scene graph generation. These methods often design sophisticated context encoders to extract the inherent relevance of the scene context to the predicates, and complicated networks to improve the model's ability to learn from highly imbalanced data distributions. To address the unbiased SGG problem, we present a simple yet effective method called Context-Aware Mixture-of-Experts (CAME), which improves model diversity and alleviates biased SGG without a sophisticated design. Specifically, we propose to use a mixture of experts to remedy the heavily long-tailed distribution of predicate classes, an approach that is suitable for most unbiased scene graph generators. With a mixture of relation experts, the long-tailed distribution of predicates is addressed in a divide-and-ensemble manner. As a result, biased SGG is mitigated and the model tends to make more balanced predicate predictions. However, experts with the same weight are not sufficiently diverse to discriminate between the different levels of the predicate distribution. Hence, we simply use the built-in context-aware encoder to help the network dynamically leverage the rich scene characteristics and further increase the diversity of the model. By utilizing the context information of the image, the importance of each expert with respect to the scene context is dynamically assigned. We have conducted extensive experiments on three tasks on the Visual Genome dataset, which show that CAME achieves superior performance over previous methods.
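The context-dependent weighting of experts described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear form of the experts and of the gate, and the choice to mix expert outputs at the probability level are all assumptions made for illustration. It contrasts equal expert weights with weights computed from a scene-context feature.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weight_rows, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight_rows, bias)]


def mixture_predict(relation_feat, context_feat, experts, gate=None):
    """Combine the predicate distributions of several relation experts.

    experts: list of (W, b) pairs; each hypothetical expert maps the
             relation feature to predicate logits.
    gate:    optional (W, b) pair mapping the scene-context feature to
             one score per expert. If None, all experts get equal weight.
    Returns (mixed predicate probabilities, expert weights).
    """
    expert_probs = [softmax(linear(W, b, relation_feat)) for W, b in experts]
    if gate is None:
        # Equal weights: every expert counts the same for every image.
        weights = [1.0 / len(experts)] * len(experts)
    else:
        # Context-aware weights (assumed form): softmax over gate scores.
        gate_w, gate_b = gate
        weights = softmax(linear(gate_w, gate_b, context_feat))
    num_classes = len(expert_probs[0])
    mixed = [sum(w * p[c] for w, p in zip(weights, expert_probs))
             for c in range(num_classes)]
    return mixed, weights


# Toy setup: 2 experts, 3 predicate classes, 2-dimensional features.
experts = [
    ([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]], [0.0, 0.0, 0.0]),
]
gate = ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])

probs, weights = mixture_predict([1.0, 0.5], [1.0, 0.0], experts, gate)
print("expert weights:", weights)
print("mixed predicate probabilities:", probs)
```

In this toy setup, changing the context feature from `[1.0, 0.0]` to `[0.0, 1.0]` shifts the weight from the first expert to the second, whereas passing `gate=None` keeps both experts at 0.5 regardless of the scene.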