Tiny objects appear frequently in practical applications but have weak appearance and features, and they are receiving increasing interest in many vision tasks, such as object detection and segmentation. To promote research and development in tiny object tracking, we create a large-scale video dataset containing 434 sequences with a total of more than 217K frames. Each frame is carefully annotated with a high-quality bounding box. In creating the data, we take 12 challenge attributes into account to cover a broad range of viewpoints and scene complexities, and we annotate these attributes to facilitate attribute-based performance analysis. To provide a strong baseline for tiny object tracking, we propose a novel Multilevel Knowledge Distillation Network (MKDNet), which performs three-level knowledge distillation in a unified framework to effectively enhance the feature representation, discrimination, and localization abilities in tracking tiny objects. Extensive experiments are performed on the proposed dataset, and the results demonstrate the superiority and effectiveness of MKDNet compared with state-of-the-art methods. The dataset, the algorithm code, and the evaluation code are available at https://github.com/mmic-lcl/Datasets-and-benchmark-code.
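For readers unfamiliar with knowledge distillation, the sketch below shows the generic temperature-softened distillation loss between a teacher and a student. It is not MKDNet's formulation: the abstract does not specify the three distillation levels, so the function names `softmax` and `distillation_loss`, the logit inputs, and the default temperature are illustrative assumptions only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration; not taken from the MKDNet code.
    """
    p = softmax(teacher_logits, temperature)  # softened teacher targets
    q = softmax(student_logits, temperature)  # softened student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student's logits match the teacher's and grows as the two softened distributions diverge; in a tracker, a term of this kind would typically be added to the task loss with a weighting factor.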