Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful cues for modeling such interactions. Despite a few previous attempts to exploit trackers in FPV applications, a methodical analysis of the performance of state-of-the-art visual trackers in this domain is still missing. In this short paper, we summarize the first systematic study of object tracking in FPV. Our work extensively analyses the performance of recent and baseline FPV trackers with respect to different aspects. This is achieved through TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences. The results suggest that more research effort should be devoted to this problem so that tracking can benefit FPV tasks. The full version of this paper is available at arXiv:2108.13665.