Single object tracking (SOT) research is caught in a cycle: trackers perform well on most benchmarks but quickly fail in challenging scenarios, leading researchers to blame insufficient data and to invest more effort in constructing larger datasets with more challenging situations. However, isolated experimental environments and limited evaluation methods hinder SOT research more seriously. The former prevents existing datasets from being exploited comprehensively, while the latter neglects challenging factors in the evaluation process. In this article, we systematize the representative benchmarks and form a single object tracking metaverse (SOTVerse), a user-defined SOT task space, to break through this bottleneck. We first propose a 3E Paradigm that describes tasks by three components (i.e., environment, evaluation, and executor). Then, we summarize task characteristics, clarify the organization standards, and construct SOTVerse with 12.56 million frames. Specifically, SOTVerse automatically labels challenging factors per frame, allowing users to generate user-defined spaces efficiently via construction rules. In addition, SOTVerse provides two mechanisms with new indicators and successfully evaluates trackers under various subtasks. Consequently, SOTVerse is the first to provide a strategy for improving resource utilization in the computer vision area, making research more standardized and scientific. The SOTVerse, toolkit, evaluation server, and results are available at http://metaverse.aitestunion.com.
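To make the 3E Paradigm and the idea of a user-defined space more concrete, the following is a minimal sketch rather than the SOTVerse toolkit itself. All names (`Frame`, `Task`, `build_subspace`, the challenge tags) are hypothetical, the executor is simplified to a stateless function, and plain intersection-over-union stands in for the evaluation component; it is not one of the new indicators mentioned above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

BBox = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Frame:
    """One annotated frame with its per-frame challenge tags (hypothetical schema)."""
    sequence: str
    index: int
    box: BBox
    challenges: Set[str] = field(default_factory=set)


def iou(a: BBox, b: BBox) -> float:
    """Standard intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Task:
    """A task described by the three components: environment, evaluation, executor."""
    environment: List[Frame]                    # the frames to track on
    evaluation: Callable[[BBox, BBox], float]   # scores a prediction against ground truth
    executor: Callable[[Frame], BBox]           # the tracker (simplified, stateless)

    def run(self) -> float:
        """Mean evaluation score of the executor over the environment."""
        if not self.environment:
            return 0.0
        scores = [self.evaluation(self.executor(f), f.box) for f in self.environment]
        return sum(scores) / len(scores)


def build_subspace(frames: List[Frame], required: Set[str]) -> List[Frame]:
    """Assumed construction rule: keep frames tagged with every required challenge."""
    return [f for f in frames if required <= f.challenges]


frames = [
    Frame("a", 0, (0, 0, 10, 10), {"occlusion"}),
    Frame("a", 1, (0, 0, 10, 10), {"occlusion", "fast_motion"}),
    Frame("b", 0, (5, 5, 10, 10), set()),
]
occluded = build_subspace(frames, {"occlusion"})
# A toy executor that always predicts the same box.
task = Task(environment=occluded, evaluation=iou, executor=lambda f: (0, 0, 10, 10))
score = task.run()
```

Because the per-frame tags live on the data rather than on the dataset, the same pool of frames can back many different tasks simply by changing the rule passed to `build_subspace`.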