Learning a Robust Society of Tracking Parts using Co-occurrence Constraints

Object tracking is an essential problem in computer vision that has been researched for several decades. One of the main challenges in tracking is adapting to changes in object appearance over time while avoiding drift to background clutter. We address this challenge by proposing a deep neural network architecture composed of different parts, which functions as a society of tracking parts. The parts work in conjunction according to a certain policy and learn from each other using co-occurrence constraints that ensure robust inference and learning. Structurally, our network is composed of two main pathways. The first pathway is more conservative: it carefully monitors a large set of simple tracker parts learned as linear filters over deep feature activation maps, assigns the parts different roles, promotes the reliable ones, and removes the inconsistent ones. We learn these filters simultaneously and efficiently, with a single closed-form formulation for which we propose novel theoretical properties. The second pathway is more progressive: it is learned completely online and is thus better able to model object appearance changes. To adapt in a robust manner, it is learned only on highly confident frames, which are selected using co-occurrences with the first pathway. Our system thus combines the benefits of two main approaches in tracking: the large set of simpler filter parts offers robustness, while the full deep network learned online provides adaptability to change. As shown in the experimental section, our approach achieves state-of-the-art performance on the challenging VOT17 benchmark, outperforming existing published methods by a significant margin on both the general EAO metric and the number of failures.
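The interplay between the two pathways can be illustrated with a minimal sketch. It is not the implementation behind the results above: the `Part` bookkeeping, the distance-based agreement test, and every threshold (`radius`, `min_frames`, `promote_at`, `drop_at`, `min_fraction`) are hypothetical choices made only for illustration. The sketch assumes each part votes for an object center, that a part's reliability is measured by how often its vote co-occurs with a consensus center, and that a frame counts as highly confident when most reliable parts agree with the center predicted by the second pathway.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Part:
    """Hypothetical bookkeeping for one simple tracker part."""
    name: str
    hits: int = 0    # frames where the part's vote agreed with the consensus
    frames: int = 0  # frames during which the part has been monitored

    def agreement(self) -> float:
        return self.hits / self.frames if self.frames else 0.0


def agrees(vote, center, radius):
    """True if a voted center lies within `radius` pixels of `center`."""
    return hypot(vote[0] - center[0], vote[1] - center[1]) <= radius


def update_roles(parts, votes, consensus, radius=2.0, min_frames=3,
                 promote_at=0.8, drop_at=0.3):
    """Update co-occurrence statistics and return (reliable, kept) parts.

    Parts are only judged after `min_frames` frames (assumed warm-up).
    """
    reliable, kept = [], []
    for part in parts:
        part.frames += 1
        part.hits += int(agrees(votes[part.name], consensus, radius))
        judged = part.frames >= min_frames
        if judged and part.agreement() < drop_at:
            continue  # inconsistent part: removed from the society
        kept.append(part)
        if judged and part.agreement() >= promote_at:
            reliable.append(part)  # consistent part: promoted
    return reliable, kept


def is_confident_frame(reliable, votes, net_center, radius=2.0,
                       min_fraction=0.75):
    """Gate online learning: enough reliable parts must agree with the net."""
    if not reliable:
        return False
    agreeing = sum(agrees(votes[p.name], net_center, radius) for p in reliable)
    return agreeing / len(reliable) >= min_fraction


# Toy run: parts "a" and "b" follow the object, part "c" drifts to clutter.
parts = [Part("a"), Part("b"), Part("c")]
frame_votes = [
    {"a": (10, 10), "b": (10, 11), "c": (30, 30)},
    {"a": (11, 10), "b": (11, 11), "c": (2, 40)},
    {"a": (12, 10), "b": (12, 11), "c": (50, 5)},
]
consensus_track = [(10, 10), (11, 10), (12, 10)]

reliable = []
for votes, center in zip(frame_votes, consensus_track):
    reliable, parts = update_roles(parts, votes, center)

print([p.name for p in reliable], [p.name for p in parts])
print(is_confident_frame(reliable, frame_votes[-1], (12, 10.5)))
```

In this toy setting the conservative side only ever removes or promotes parts based on accumulated agreement, while the gate decides whether a frame would be used to update the online network, mirroring the division of roles described in the abstract.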
