In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real time, with a single simple approach. Our method, dubbed SiamMask, improves the offline training procedure of popular fully-convolutional Siamese approaches for object tracking by augmenting their loss with a binary segmentation task. Once trained, SiamMask relies solely on a single bounding-box initialisation and operates online, producing class-agnostic object segmentation masks and rotated bounding boxes at 55 frames per second. Despite its simplicity, versatility and speed, our strategy allows us to establish a new state of the art among real-time trackers on VOT-2018, while at the same time demonstrating competitive performance and the best speed for the semi-supervised video object segmentation task on DAVIS-2016 and DAVIS-2017. The project website is http://www.robots.ox.ac.uk/~qwang/SiamMask.
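To make the idea of augmenting a tracking loss with a binary segmentation task concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-pixel mask logits and ground-truth labels in {-1, +1}, and adds a weighted binary logistic term to an already computed tracking loss. The function names, the flat list representation of the mask, and the `mask_weight` parameter are hypothetical choices made for illustration only.

```python
import math


def binary_logistic_loss(logits, labels):
    """Mean of log(1 + exp(-y * s)) over pixels, with labels y in {-1, +1}.

    Hypothetical helper: `logits` and `labels` are flat sequences of equal length.
    """
    total = 0.0
    for s, y in zip(logits, labels):
        z = -y * s
        # Numerically stable softplus: log(1 + exp(z)).
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def combined_loss(tracking_loss, mask_logits, mask_labels, mask_weight=1.0):
    """Tracking loss plus a weighted binary segmentation term.

    `mask_weight` is an assumed balancing hyperparameter, not a value
    taken from the abstract.
    """
    mask_term = binary_logistic_loss(mask_logits, mask_labels)
    return tracking_loss + mask_weight * mask_term


if __name__ == "__main__":
    # Toy example: four mask pixels, all predicted with the correct sign.
    logits = [2.0, -1.5, 3.0, -0.5]
    labels = [1, -1, 1, -1]
    print(combined_loss(0.7, logits, labels, mask_weight=2.0))
```

In this sketch the segmentation term only affects offline training; at inference time a tracker trained this way would simply read out the predicted mask alongside its usual outputs.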