Relation context has proven useful for many challenging vision tasks. In the field of 3D object detection, previous methods have taken advantage of context encoding, graph embedding, or explicit relation reasoning to extract relation context. However, redundant relation context inevitably arises from noisy or low-quality proposals. In fact, invalid relation context usually indicates underlying scene misunderstanding and ambiguity, which may, on the contrary, degrade performance in complex scenes. Inspired by recent attention mechanisms such as the Transformer, we propose a novel 3D attention-based relation module (ARM3D). It comprises object-aware relation reasoning to extract pairwise relation contexts among qualified proposals and an attention module to distribute attention weights across different relation contexts. In this way, ARM3D can take full advantage of the useful relation context and filter out contexts that are less relevant or even confusing, which mitigates ambiguity in detection. We have evaluated the effectiveness of ARM3D by plugging it into several state-of-the-art 3D object detectors and showing more accurate and robust detection results. Extensive experiments demonstrate the capability and generalization of ARM3D on 3D object detection. Our source code is available at https://github.com/lanlan96/ARM3D.
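To make the two ingredients described above concrete, the following is a minimal PyTorch sketch of the general pattern: pairwise relation reasoning over proposal features followed by attention weights that down-weight less relevant relation contexts. It is not the authors' implementation; all class and parameter names (e.g. ToyAttentiveRelationModule, relation_mlp, attn_score, feat_dim, rel_dim) are hypothetical, and the released ARM3D code should be consulted for the actual design.

```python
# Conceptual sketch only: pairwise relation contexts + attention weighting.
# Names and dimensions are assumptions, not taken from the ARM3D repository.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyAttentiveRelationModule(nn.Module):
    def __init__(self, feat_dim: int = 128, rel_dim: int = 128):
        super().__init__()
        # Relation reasoning: map a concatenated proposal pair to a relation context.
        self.relation_mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, rel_dim), nn.ReLU(),
            nn.Linear(rel_dim, rel_dim),
        )
        # Attention scoring: one scalar score per pairwise relation context.
        self.attn_score = nn.Linear(rel_dim, 1)

    def forward(self, proposal_feats: torch.Tensor) -> torch.Tensor:
        # proposal_feats: (N, feat_dim) features of N object proposals.
        n = proposal_feats.size(0)
        # Build all ordered pairs (i, j) of proposal features.
        fi = proposal_feats.unsqueeze(1).expand(n, n, -1)          # (N, N, D)
        fj = proposal_feats.unsqueeze(0).expand(n, n, -1)          # (N, N, D)
        rel_ctx = self.relation_mlp(torch.cat([fi, fj], dim=-1))   # (N, N, rel_dim)

        # Attention over each proposal's relation contexts, so noisy or
        # irrelevant pairs contribute little to the aggregated context.
        scores = self.attn_score(rel_ctx).squeeze(-1)              # (N, N)
        self_mask = torch.eye(n, dtype=torch.bool, device=scores.device)
        scores = scores.masked_fill(self_mask, float("-inf"))      # ignore self-pairs
        weights = F.softmax(scores, dim=-1)                        # (N, N)

        # Weighted sum of relation contexts: one aggregated context per proposal.
        return torch.einsum("ij,ijd->id", weights, rel_ctx)        # (N, rel_dim)


# Usage sketch: 64 proposals with 128-dim features from some detection backbone.
if __name__ == "__main__":
    feats = torch.randn(64, 128)
    relation_context = ToyAttentiveRelationModule()(feats)
    print(relation_context.shape)  # torch.Size([64, 128])
```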