LiDAR-based 3D single object tracking (SOT) is a challenging problem in robotics and autonomous driving. Existing approaches struggle because distant objects often have very sparse or partially occluded point clouds, which makes the features extracted by the model ambiguous. Ambiguous features make the target hard to locate and ultimately degrade tracking results. To address this problem, we build on the Transformer architecture and propose a Point-Track-Transformer (PTT) module for point cloud-based 3D single object tracking. Specifically, the PTT module generates fine-tuned attention features by computing attention weights, which guide the tracker to focus on the important features of the target and improve tracking in complex scenarios. To evaluate the PTT module, we embed it into the dominant method and construct a novel 3D SOT tracker named PTT-Net. In PTT-Net, we embed PTT into the voting stage and the proposal generation stage. The PTT module in the voting stage models the interactions among point patches and thereby learns context-dependent features, while the PTT module in the proposal generation stage captures the contextual information between object and background. We evaluate PTT-Net on the KITTI and NuScenes datasets. Experimental results demonstrate the effectiveness of the PTT module and the superiority of PTT-Net, which surpasses the baseline by a noticeable margin, ~10% in the Car category. Our method also improves performance significantly in sparse scenarios. Overall, combining the Transformer with the tracking pipeline enables PTT-Net to achieve state-of-the-art performance on both datasets. Additionally, PTT-Net runs in real time at 40 FPS on an NVIDIA 1080Ti GPU. Our code is open-sourced for the research community at https://github.com/shanjiayao/PTT.
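As a rough illustration of what "computing attention weights" over point features can mean, the following is a minimal sketch of generic scaled dot-product self-attention with a residual connection. It is not the PTT module itself: the function names `softmax` and `self_attention` are hypothetical, the query, key, and value projections are assumed to be identity maps, and learned weights and position encoding are omitted for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Generic self-attention over N point features of dimension d.

    Assumption: identity query/key/value projections (a real module
    would learn these). Returns the attended features with a residual
    connection, plus the N x N attention weight matrix.
    """
    n = len(features)
    d = len(features[0])
    scale = math.sqrt(d)
    attended_features = []
    all_weights = []
    for query in features:
        # Similarity of this point to every point, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in features
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # Weighted sum of all point features.
        mixed = [
            sum(weights[j] * features[j][c] for j in range(n))
            for c in range(d)
        ]
        # Residual connection keeps the original feature in the output.
        attended_features.append([q + m for q, m in zip(query, mixed)])
    return attended_features, all_weights


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, weights = self_attention(points)
    for row in weights:
        print([round(w, 3) for w in row])
```

Each row of the weight matrix sums to one, so every output feature is the original feature plus a convex combination of all point features, with more similar points contributing more.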