Scene understanding is an essential and challenging task in computer vision. Scene graphs, which provide a fundamental graph-structured description of an image's visual content, have received increasing attention because of their strong semantic representation. However, it remains difficult to draw a proper scene graph for image retrieval, image generation, and multi-modal applications. Conventional scene graph annotation interfaces are hard to use for image annotation, and automatic scene graph generation approaches based on deep neural networks tend to produce redundant content while disregarding details. In this work, we propose SGDraw, a scene graph drawing interface that uses an object-oriented scene graph representation to help users draw and edit scene graphs interactively. In the proposed object-oriented representation, an object together with its attributes and relationships is treated as a single structural unit. SGDraw provides a web-based scene graph annotation and generation tool for scene understanding applications. To verify the effectiveness of the proposed interface, we conducted a comparison study against a conventional tool and a user experience study. The results show that SGDraw helps users generate scene graphs with richer details and describe images more accurately than traditional bounding box annotations. We believe SGDraw can be useful in various vision tasks, such as image retrieval and generation.
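To make the object-oriented representation concrete, the following is a minimal Python sketch of a scene graph in which each object carries its own attributes and outgoing relationships as one structural unit. The class names (`SceneObject`, `SceneGraph`), the method names, and the example labels are illustrative assumptions and are not taken from the SGDraw implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    """Hypothetical structural unit: an object plus its attributes and relationships."""
    name: str
    attributes: List[str] = field(default_factory=list)
    # Each relationship is stored as (predicate, name of the target object).
    relationships: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class SceneGraph:
    """Hypothetical container that maps object names to their structural units."""
    objects: Dict[str, SceneObject] = field(default_factory=dict)

    def add_object(self, name: str, *attributes: str) -> SceneObject:
        # Create the object if it is new, then attach any attributes not yet present.
        obj = self.objects.setdefault(name, SceneObject(name))
        for attribute in attributes:
            if attribute not in obj.attributes:
                obj.attributes.append(attribute)
        return obj

    def relate(self, subject: str, predicate: str, target: str) -> None:
        # Both endpoints must already exist so that no edge dangles.
        for name in (subject, target):
            if name not in self.objects:
                raise KeyError(f"unknown object: {name}")
        self.objects[subject].relationships.append((predicate, target))

    def triples(self) -> List[Tuple[str, str, str]]:
        # Flatten the per-object units into (subject, predicate, object) triples.
        return [
            (obj.name, predicate, target)
            for obj in self.objects.values()
            for predicate, target in obj.relationships
        ]


# Example usage with made-up labels.
graph = SceneGraph()
graph.add_object("man", "tall")
graph.add_object("horse", "brown")
graph.relate("man", "riding", "horse")
print(graph.triples())
```

Grouping attributes and relationships under the object they belong to means that editing or deleting one object touches a single unit, which is one plausible reason such a layout suits interactive drawing; the flat triple list can still be derived when a downstream task needs it.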