Object-level data association is central to robotic applications such as tracking-by-detection and object-level simultaneous localization and mapping. While current learned visual data association methods outperform hand-crafted algorithms, many rely on large collections of domain-specific training examples that can be difficult to obtain without prior knowledge. Additionally, such methods often remain fixed during inference and do not exploit observed information to improve their performance. We propose a self-supervised method for incrementally refining visual descriptors to improve performance on the task of object-level visual data association. Our method optimizes deep descriptor generators online by continuously training a widely available image classification network pre-trained with domain-independent data. We show that earlier layers of the network outperform later-stage layers on the data association task while also allowing a 94% reduction in the number of parameters, enabling the online optimization. We show that self-labelling challenging triplets (choosing positive examples separated by large temporal distances and negative examples close in the descriptor space) improves the quality of the learned descriptors for the multi-object tracking task. Finally, we demonstrate that our approach surpasses other visual data-association methods applied to a tracking-by-detection task, and show that it provides better performance gains than other methods that attempt to adapt to observed information.
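The triplet-mining strategy described above (positives far apart in time, negatives near in descriptor space) can be illustrated with a minimal sketch. All function and variable names here are illustrative assumptions, not the paper's implementation, and NumPy stands in for the actual online-trained network:

```python
import numpy as np

def mine_hard_triplets(descriptors, track_ids, frame_ids, min_temporal_gap=10):
    """Illustrative hard-triplet mining for self-supervised descriptor refinement.

    For each anchor detection, pick the same-track positive with the largest
    temporal distance (subject to a minimum gap) and the different-track
    negative closest in descriptor space (a hard negative).
    """
    triplets = []
    n = len(descriptors)
    for i in range(n):
        d, t, f = descriptors[i], track_ids[i], frame_ids[i]
        # Positives: same track, separated by a large temporal distance.
        pos = [j for j in range(n)
               if track_ids[j] == t and abs(frame_ids[j] - f) >= min_temporal_gap]
        if not pos:
            continue
        p = max(pos, key=lambda j: abs(frame_ids[j] - f))
        # Negatives: different track, nearest in descriptor space.
        neg = [j for j in range(n) if track_ids[j] != t]
        if not neg:
            continue
        m = min(neg, key=lambda j: np.linalg.norm(descriptors[j] - d))
        triplets.append((i, p, m))
    return triplets

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss on L2 distances."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)
```

In an online setting, the mined triplets would feed the triplet loss to continuously fine-tune the early layers of the pre-trained classification network; here the loss is computed directly on fixed descriptor vectors for clarity.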