Scene graph generation has gained considerable attention in computer vision research, driven by growing demand from image understanding applications such as visual question answering, image captioning, self-driving cars, crowd behavior analysis, and activity recognition. A scene graph, a visually grounded graph representation of an image, greatly simplifies image understanding tasks. In this work, we introduce a post-processing algorithm called Geometric Context that improves the geometric understanding of visual scenes. The algorithm adds and refines the geometric relationships between object pairs predicted by a prior model. We exploit this context by computing the direction and distance between object pairs. We use the Knowledge Embedded Routing Network (KERN) as our baseline model, extend it with our algorithm, and show results comparable to recent state-of-the-art algorithms.
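As an illustration of the direction and distance computation mentioned above, the following is a minimal sketch and not the authors' implementation. The abstract does not say how these quantities are defined, so the sketch assumes axis-aligned bounding boxes given as (x1, y1, x2, y2), center-to-center Euclidean distance, and an angle measured counterclockwise from the positive x-axis. The function names `box_center`, `pair_geometry`, and `direction_label`, and the four coarse labels, are hypothetical.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def pair_geometry(subject_box, object_box):
    """Return (distance, angle in degrees) from the subject to the object.

    Assumption: distance is measured between box centers, in pixels.
    """
    sx, sy = box_center(subject_box)
    ox, oy = box_center(object_box)
    dx, dy = ox - sx, oy - sy
    distance = math.hypot(dx, dy)
    # Image y grows downward, so negate dy: 90 degrees then means
    # "the object is above the subject".
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0
    return distance, angle


def direction_label(angle):
    """Map an angle in degrees to one of four coarse labels (hypothetical)."""
    labels = ["right of", "above", "left of", "below"]
    # Shift by 45 degrees so each label covers a 90-degree sector
    # centered on its axis.
    return labels[int(((angle + 45.0) % 360.0) // 90.0)]
```

For a detected pair, `pair_geometry(subject_box, object_box)` yields the two quantities, and `direction_label` turns the angle into a candidate geometric predicate that a post-processing step could add or use to check an existing prediction. A real system would likely also normalize the distance by image or box size; that choice is left open here.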