In this work, we propose a novel uncertainty-aware object detection framework built on a structured graph, where nodes and edges denote objects and their spatial-semantic similarities, respectively. Specifically, we aim to exploit relationships among objects to contextualize them effectively. To achieve this, we first detect objects and then measure their semantic and spatial distances to construct an object graph, which is processed by a graph neural network (GNN) to refine the objects' visual CNN features. However, refining the CNN features and detection results of every object is inefficient and often unnecessary, since many objects are already correctly predicted with low uncertainty. Therefore, we propose to handle uncertain objects by not only transferring representations from certain objects (sources) to uncertain objects (targets) over a directed graph, but also improving CNN features only for objects regarded as uncertain, using their representational outputs from the GNN. Furthermore, we compute the training loss with larger weights on uncertain objects, so that training concentrates on improving uncertain object predictions while maintaining high performance on certain objects. We refer to our model as the Uncertainty-Aware Graph network for object DETection (UAGDet). We then experimentally validate our model on the challenging large-scale aerial image dataset DOTA, which contains many objects of small to large sizes per image, and show that it improves the performance of an existing object detection network.
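To illustrate the directed-graph idea described above, the following is a minimal sketch of connecting certain objects (sources) to uncertain objects (targets) based on a combined spatial-semantic affinity. All field names, thresholds, and the exact similarity form are assumptions for illustration, not the paper's specification:

```python
import math

def build_directed_graph(objects, sim_threshold=0.5, unc_threshold=0.3):
    """Hypothetical sketch: add a directed edge from each certain object
    (source) to each uncertain object (target) when their spatial-semantic
    similarity is high. `objects` is a list of dicts with 'center' (x, y),
    'embedding' (feature vector), and 'uncertainty' in [0, 1]."""
    edges = []
    for i, src in enumerate(objects):
        if src['uncertainty'] >= unc_threshold:
            continue  # only confident objects act as sources
        for j, tgt in enumerate(objects):
            if i == j or tgt['uncertainty'] < unc_threshold:
                continue  # only uncertain objects are refinement targets
            # spatial affinity: decays with distance between box centers
            dx = src['center'][0] - tgt['center'][0]
            dy = src['center'][1] - tgt['center'][1]
            spatial = math.exp(-math.hypot(dx, dy))
            # semantic affinity: cosine similarity of feature embeddings
            dot = sum(a * b for a, b in zip(src['embedding'], tgt['embedding']))
            na = math.sqrt(sum(a * a for a in src['embedding']))
            nb = math.sqrt(sum(b * b for b in tgt['embedding']))
            semantic = dot / (na * nb) if na and nb else 0.0
            if spatial * semantic > sim_threshold:
                edges.append((i, j))  # directed edge: certain -> uncertain
    return edges
```

In the full model, the resulting edges would feed a GNN that propagates representations along them; here the graph construction alone is shown.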