Current 3D single object tracking approaches track the target by comparing features between the target template and the search area. However, because occlusion is common in LiDAR scans, it is non-trivial to conduct accurate feature comparisons on severely sparse and incomplete shapes. In this work, we exploit the ground truth bounding box given in the first frame as a strong cue to enhance the feature description of the target object, enabling a more accurate feature comparison in a simple yet effective way. In particular, we first propose the BoxCloud, an informative and robust representation that depicts an object using the point-to-box relation. We further design an efficient box-aware feature fusion module, which leverages the BoxCloud for reliable feature matching and embedding. Integrating the proposed general components into the existing P2B model, we construct a superior box-aware tracker (BAT). Experiments confirm that our proposed BAT outperforms the previous state-of-the-art by a large margin on both the KITTI and nuScenes benchmarks, achieving a 12.8% improvement in precision while running ~20% faster.
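To make the point-to-box relation concrete, the following minimal sketch encodes each point by its distances to a set of reference points on the bounding box. The abstract does not specify the encoding, so the choice of the box center plus its eight corners (nine distances per point) is an assumption here, and the names `box_corners` and `box_cloud` are hypothetical helpers rather than the authors' code.

```python
import math
from itertools import product


def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box rotated by `yaw` about the z-axis."""
    cx, cy, cz = center
    length, width, height = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        # Offset of this corner in the box's own frame.
        dx, dy, dz = sx * length, sy * width, sz * height
        # Rotate the offset about z, then translate to the box center.
        corners.append((cx + cos_y * dx - sin_y * dy,
                        cy + sin_y * dx + cos_y * dy,
                        cz + dz))
    return corners


def box_cloud(points, center, size, yaw=0.0):
    """Assumed encoding: 9 distances per point (center first, then 8 corners)."""
    refs = [tuple(center)] + box_corners(center, size, yaw)
    return [[math.dist(p, r) for r in refs] for p in points]


# Two illustrative points inside a 2 x 2 x 2 box centered at the origin.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
features = box_cloud(points, center=(0.0, 0.0, 0.0), size=(2.0, 2.0, 2.0))
```

Under this assumed encoding, a point's descriptor depends only on where it sits relative to the box and not on how many neighboring points survive occlusion, which is one plausible reading of why such a representation could remain usable on sparse scans.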