The primary goal of this paper is to jointly localize objects in a group of semantically similar images, a task known as object co-localization. Most related existing works are essentially weakly supervised, relying primarily on weak supervision from neighboring images. Although weak supervision is beneficial, it is not entirely reliable, as the results are quite sensitive to the choice of neighboring images. In this paper, we mitigate this issue by combining it with a self-awareness phenomenon. By self-awareness, we refer to the solution derived from the image itself in the form of a saliency cue, which can also be unreliable when applied alone. Nevertheless, combining these two paradigms leads to a better co-localization ability. Specifically, we introduce a dynamic mediator that adaptively strikes a proper balance between the two static solutions to provide an optimal one. We therefore call this method \textit{ASOC}: Adaptive Self-aware Object Co-localization. We perform exhaustive experiments on several benchmark datasets and validate that weak supervision supplemented with self-awareness outperforms several competing methods.
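The core idea of fusing a per-image saliency cue with a neighbor-derived weak-supervision cue can be sketched as an adaptively weighted combination of two score maps. The weighting rule below (trusting the saliency cue more when the two cues agree) is purely a hypothetical illustration; the abstract does not specify the actual form of the paper's dynamic mediator.

```python
import numpy as np

def adaptive_combination(saliency, consensus):
    """Illustrative sketch: fuse a per-image saliency map with a
    weak-supervision (neighbor-consensus) map via an adaptive weight.
    The agreement-based weight is a hypothetical stand-in for the
    paper's dynamic mediator, not its actual formulation."""
    s = saliency / (saliency.sum() + 1e-8)   # normalize each cue to sum to 1
    c = consensus / (consensus.sum() + 1e-8)
    agreement = np.minimum(s, c).sum()       # overlap of the two cues, in [0, 1]
    w = 0.5 + 0.5 * agreement                # lean on saliency more when cues agree
    return w * s + (1.0 - w) * c             # fused localization score map

# Toy 2x2 score maps: both cues peak at the top-left cell.
sal = np.array([[0.9, 0.1], [0.1, 0.1]])
con = np.array([[0.8, 0.2], [0.2, 0.2]])
fused = adaptive_combination(sal, con)
```

Because both inputs are normalized, the fused map remains a valid score distribution, and its peak follows the location on which the two cues concur.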