Single object tracking in point clouds has attracted growing attention owing to the presence of LiDAR sensors in 3D vision. However, existing methods based on deep neural networks mainly train a separate model for each category, so they perform poorly in real-world applications when they encounter classes unseen during training. In this work, we therefore turn to a more challenging task on LiDAR point clouds, class-agnostic tracking, in which a single general model is learned for any specified target from both observed and unseen categories. We first investigate the class-agnostic performance of state-of-the-art trackers by exposing them to unseen categories at test time, and find that a key factor for class-agnostic tracking is how to constrain the fused features between the template and search region so that generalization is maintained when the distribution shifts from observed to unseen classes. We therefore propose a feature decorrelation method that eliminates spurious correlations in the fused features through a set of learned weights and further makes the search region consistent among foreground points and distinctive between foreground and background points. Experiments on KITTI and NuScenes demonstrate that the proposed method achieves considerable improvements when benchmarked against the advanced trackers P2B and BAT, especially when tracking unseen objects.
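The abstract does not spell out how the learned weights are obtained, so the following is only a minimal sketch of the general idea of decorrelating features by reweighting samples. It is not the paper's implementation. It assumes one weight per point, a softmax parameterization to keep the weights positive, and a loss equal to the sum of squared off-diagonal entries of the weighted feature covariance. All function names, the step size, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def weighted_covariance(features, probs):
    """Covariance of the feature columns under per-point weights that sum to 1."""
    mean = probs @ features
    centered = features - mean
    return (centered * probs[:, None]).T @ centered


def off_diagonal_loss(features, probs):
    """Sum of squared off-diagonal entries of the weighted covariance."""
    cov = weighted_covariance(features, probs)
    off = cov - np.diag(np.diag(cov))
    return float(np.sum(off ** 2))


def learn_decorrelation_weights(features, steps=300, lr=0.05):
    """Learn one positive weight per point by gradient descent (illustrative only).

    Returns weights with mean 1, so uniform weighting corresponds to all ones.
    """
    n = features.shape[0]
    logits = np.zeros(n)  # softmax(logits) keeps the weights positive and normalized
    for _ in range(steps):
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        centered = features - probs @ features
        cov = (centered * probs[:, None]).T @ centered
        off = cov - np.diag(np.diag(cov))
        # Gradient of the loss w.r.t. each probability, up to an additive
        # constant that the softmax Jacobian below removes.
        grad_probs = 2.0 * np.einsum("ia,ab,ib->i", centered, off, centered)
        grad_logits = probs * (grad_probs - probs @ grad_probs)
        # Scale by n so the step size does not depend on the number of points.
        logits -= lr * n * grad_logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return probs * n
```

In a tracker, weights of this kind would typically multiply a per-point training loss so that the network sees features whose dimensions are closer to statistically independent. How the authors combine the weights with their consistency and distinctiveness constraints is not stated in the abstract.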