Object-level data association is central to robotic applications such as tracking-by-detection and object-level simultaneous localization and mapping. While current learned visual data association methods outperform hand-crafted algorithms, many rely on large collections of domain-specific training examples that can be difficult to obtain without prior knowledge. Additionally, such methods often remain fixed during inference and do not exploit observed information to improve their performance. We propose a self-supervised method for incrementally refining visual descriptors to improve performance on the task of object-level visual data association. Our method optimizes deep descriptor generators online by continuously training a widely available image classification network pre-trained with domain-independent data. We show that earlier layers of the network outperform later-stage layers on the data association task while also allowing a 94% reduction in the number of parameters, enabling the online optimization. We show that self-labelling challenging triplets (choosing positive examples separated by large temporal distances and negative examples close in the descriptor space) improves the quality of the learned descriptors for the multi-object tracking task. Finally, we demonstrate that our approach surpasses other visual data-association methods applied to a tracking-by-detection task, and show that it provides better performance gains than other methods that attempt to adapt to observed information.
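The triplet-mining strategy described above (positives far apart in time, negatives near in descriptor space) can be illustrated with a minimal sketch. All function and variable names here are illustrative assumptions, not the paper's implementation, and NumPy stands in for the actual online-trained network:

```python
import numpy as np

def mine_hard_triplets(descriptors, track_ids, frame_ids, min_temporal_gap=10):
    """Illustrative hard-triplet mining for self-supervised descriptor refinement.

    For each anchor detection, pick the same-track positive with the largest
    temporal distance (subject to a minimum gap) and the different-track
    negative closest in descriptor space (a hard negative).
    """
    triplets = []
    n = len(descriptors)
    for i in range(n):
        d, t, f = descriptors[i], track_ids[i], frame_ids[i]
        # Positives: same track, separated by a large temporal distance.
        pos = [j for j in range(n)
               if track_ids[j] == t and abs(frame_ids[j] - f) >= min_temporal_gap]
        if not pos:
            continue
        p = max(pos, key=lambda j: abs(frame_ids[j] - f))
        # Negatives: different track, nearest in descriptor space.
        neg = [j for j in range(n) if track_ids[j] != t]
        if not neg:
            continue
        m = min(neg, key=lambda j: np.linalg.norm(descriptors[j] - d))
        triplets.append((i, p, m))
    return triplets

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss on L2 distances."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)
```

In an online setting, the mined triplets would feed the triplet loss to continuously fine-tune the early layers of the pre-trained classification network; here the loss is computed directly on fixed descriptor vectors for clarity.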