In the same vein as discriminative one-shot learning, Siamese networks can recognize an object from a single exemplar with the same class label. However, because they are trained only on pairs of instances, they do not exploit the underlying structure of the data or the relationships among the many samples. In this paper, we propose a new quadruplet deep network that examines the potential connections among the training instances, aiming to achieve a more powerful representation. We design four shared networks that receive multi-tuples of instances as inputs and are connected by a novel loss function consisting of a pair loss and a triplet loss. From each multi-tuple, we select the most similar and the most dissimilar instances, according to the similarity metric, as the positive and negative inputs of the triplet loss. We show that this scheme improves the training performance. Furthermore, we introduce a new weight layer that automatically selects suitable combination weights, which prevents conflicts between the triplet and pair losses from degrading performance. We evaluate our quadruplet framework on model-free tracking-by-detection of objects from a single initial exemplar in several Visual Object Tracking benchmarks. Our extensive experimental analysis demonstrates that our tracker achieves superior performance with a real-time processing speed of 78 frames per second (fps).
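The combined loss described above can be illustrated with a minimal NumPy sketch. It is one possible reading of the abstract, not the paper's implementation: the function names, the contrastive form of the pair loss, the Euclidean distance used as the similarity metric, the margin value, and the softmax over two weight logits (standing in for the weight layer) are all assumptions made for illustration.

```python
import numpy as np


def contrastive_pair_loss(dist, same, margin=1.0):
    """Contrastive loss for one pair (assumed form); `same` is 1.0 or 0.0."""
    return same * dist ** 2 + (1.0 - same) * max(margin - dist, 0.0) ** 2


def triplet_loss(d_pos, d_neg, margin=1.0):
    """Hinge-style triplet loss on anchor-positive and anchor-negative distances."""
    return max(d_pos - d_neg + margin, 0.0)


def combined_loss(anchor, candidates, same, weight_logits, margin=1.0):
    """Weighted sum of a pair loss and a triplet loss for one multi-tuple.

    anchor:        (d,) embedding of the exemplar
    candidates:    (k, d) embeddings of the other instances in the tuple
    same:          (k,) 1.0 if the candidate shares the anchor's label, else 0.0
    weight_logits: (2,) hypothetical learnable parameters for the two losses
    """
    dists = np.linalg.norm(candidates - anchor, axis=1)

    # Pair loss averaged over every anchor-candidate pair.
    pair = float(np.mean([contrastive_pair_loss(d, s, margin)
                          for d, s in zip(dists, same)]))

    # Literal reading of the selection rule: the closest candidate is the
    # positive input and the farthest candidate is the negative input.
    trip = float(triplet_loss(dists.min(), dists.max(), margin))

    # Softmax keeps the two combination weights positive and summing to one.
    w = np.exp(weight_logits - np.max(weight_logits))
    w = w / w.sum()

    total = float(w[0] * pair + w[1] * trip)
    return total, pair, trip


anchor = np.array([0.0, 0.0])
candidates = np.array([[0.1, 0.0], [2.0, 0.0], [3.0, 0.0]])
same = np.array([1.0, 0.0, 0.0])
total, pair, trip = combined_loss(anchor, candidates, same, np.zeros(2))
```

In a trained network the weight logits would be updated by backpropagation together with the shared embedding network; here they are fixed at zero, so both losses receive equal weight.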