3D single object tracking is a key problem in robotics. In this paper, we propose a transformer module called Point-Track-Transformer (PTT) for point cloud-based 3D single object tracking. The PTT module contains three blocks: feature embedding, position encoding, and self-attention feature computation. Feature embedding places features closer together in the embedding space when they carry similar semantic information. Position encoding maps the coordinates of the point cloud to high-dimensional, distinguishable features. Self-attention computes attention weights and uses them to generate refined attention features. We embed the PTT module into the open-source state-of-the-art method P2B to construct PTT-Net. Experiments on the KITTI dataset show that PTT-Net surpasses the state of the art by a noticeable margin (~10%). In addition, PTT-Net runs in real time (~40 FPS) on an NVIDIA 1080Ti GPU. Our code is open-sourced for the robotics community at https://github.com/shanjiayao/PTT.
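To make the three blocks concrete, the following is a minimal NumPy sketch of how feature embedding, position encoding, and self-attention could be composed. It is not the authors' implementation: the abstract does not specify layer types or sizes, so the linear embedding, the two-layer position encoder, the scaled dot-product attention, the residual connection, and all names (`ptt_like_block`, `init_params`, `w_embed`, `w_pos1`, and so on) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(c_in, d, rng):
    """Random weights for the sketch (hypothetical layout, not from the paper)."""
    scale = 0.1
    return {
        "w_embed": rng.normal(size=(c_in, d)) * scale,  # feature embedding
        "w_pos1": rng.normal(size=(3, d)) * scale,      # position encoder, layer 1
        "w_pos2": rng.normal(size=(d, d)) * scale,      # position encoder, layer 2
        "w_q": rng.normal(size=(d, d)) * scale,         # query projection
        "w_k": rng.normal(size=(d, d)) * scale,         # key projection
        "w_v": rng.normal(size=(d, d)) * scale,         # value projection
    }


def ptt_like_block(xyz, feats, params):
    """Apply an assumed PTT-style block to one point cloud.

    xyz:   (N, 3) point coordinates
    feats: (N, C_in) per-point input features
    Returns (refined, attn) with shapes (N, D) and (N, N).
    """
    # 1) Feature embedding: project input features into a D-dim space.
    emb = feats @ params["w_embed"]

    # 2) Position encoding: lift xyz coordinates to D dims (linear, ReLU, linear).
    pos = np.maximum(xyz @ params["w_pos1"], 0.0) @ params["w_pos2"]

    # 3) Self-attention over all points (scaled dot-product, assumed form).
    x = emb + pos
    q, k, v = x @ params["w_q"], x @ params["w_k"], x @ params["w_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)

    # Residual connection (an assumption) yields the refined features.
    refined = emb + attn @ v
    return refined, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xyz = rng.normal(size=(64, 3))      # 64 synthetic points
    feats = rng.normal(size=(64, 32))   # 32-dim synthetic features
    params = init_params(32, 16, rng)
    refined, attn = ptt_like_block(xyz, feats, params)
    print(refined.shape, attn.shape)    # (64, 16) (64, 64)
```

Each row of `attn` sums to one, so every refined feature is the point's own embedding plus a weighted average over all points in the cloud. In a tracker built on P2B, a block of this kind would sit between existing stages of the network, but where exactly it is inserted is not stated in the abstract.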