Discriminative Triad Matching and Reconstruction for Weakly Referring Expression Grounding

In this paper, we tackle the weakly-supervised referring expression grounding task: localizing a referent object in an image according to a query sentence when the mapping between image regions and queries is not available during training. Traditional methods pick out the object region that best matches the referring expression and then reconstruct the query sentence from the selected region, with the reconstruction difference serving as the loss for back-propagation. Existing methods, however, conduct both the matching and the reconstruction only approximately, because they ignore the fact that the matching correctness is unknown. To overcome this limitation, we design a discriminative triad as the basis of the solution, through which a query can be converted into one or multiple discriminative triads in a very scalable way. Based on the discriminative triad, we further propose triad-level matching and reconstruction modules that are lightweight yet effective for weakly-supervised training, making the model three times lighter and faster than the previous state-of-the-art methods. One important merit of our work is its superior performance despite the simple and neat design. Specifically, the proposed method achieves new state-of-the-art accuracy on the RefCOCO (39.21%), RefCOCO+ (39.18%) and RefCOCOg (43.24%) datasets, which is 4.17%, 4.08% and 7.8% higher than the previous best, respectively.
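
The generic match-then-reconstruct scheme that the abstract attributes to traditional methods can be illustrated with a minimal sketch. This is not the paper's triad-level modules. It assumes, purely for illustration, that regions and the query are already embedded as vectors of the same size, that matching is a softmax over dot products, and that the "reconstruction" is simply the attention-weighted region feature with no learned decoder. The names `match_and_reconstruct`, `regions` and `query` are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def match_and_reconstruct(region_feats, query_vec):
    """Return (best_region_index, matching_weights, reconstruction_loss).

    Assumption: toy stand-in for a learned matcher and decoder.
    """
    # Matching: score each candidate region against the query embedding.
    weights = softmax([dot(r, query_vec) for r in region_feats])
    best = max(range(len(weights)), key=weights.__getitem__)
    # Reconstruction: rebuild the query from the softly selected regions.
    dim = len(query_vec)
    recon = [sum(w * r[i] for w, r in zip(weights, region_feats))
             for i in range(dim)]
    # The reconstruction difference (mean squared error here) is the only
    # training signal, since no region-query labels are available.
    loss = sum((q - p) ** 2 for q, p in zip(query_vec, recon)) / dim
    return best, weights, loss

# Three hypothetical region embeddings and one query embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [0.0, 1.0]
best, weights, loss = match_and_reconstruct(regions, query)
print("selected region:", best, "loss:", loss)
```

In this toy setup the loss is computed without knowing whether the selected region is correct, which is the weakness the abstract points out in existing methods.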
