We propose YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 images. To achieve this, we make two improvements to the state-of-the-art image-based real-time method YOLACT: (1) applying TensorRT optimization while carefully trading off speed and accuracy, and (2) a novel feature warping module that exploits temporal redundancy in videos. Experiments on the YouTube VIS and MS COCO datasets demonstrate that YolactEdge achieves a 3-5x speedup over existing real-time methods while maintaining competitive mask and box detection accuracy. We also conduct ablation studies to dissect our design choices and modules. Code and models are available at https://github.com/haotian-liu/yolact_edge.
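To make the idea of feature warping concrete, the following is a minimal sketch of generic flow-based warping: features computed on a keyframe are resampled at flow-displaced locations to approximate the features of a later frame, so the backbone need not be rerun on it. The abstract does not describe the module's internals, so this is an illustration under stated assumptions and not the authors' implementation. The names `bilinear_sample` and `warp_features`, the single-channel list-of-rows feature layout, and the `(dy, dx)` flow convention are all hypothetical.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D feature map (list of rows) at fractional (y, x).

    Coordinates outside the map are clamped to the border.
    """
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp_features(key_feat, flow):
    """Warp keyframe features toward the current frame.

    Assumed convention: flow[y][x] = (dy, dx) is the offset from a
    current-frame location to its matching keyframe location.
    """
    h, w = len(key_feat), len(key_feat[0])
    return [
        [
            bilinear_sample(key_feat, y + flow[y][x][0], x + flow[y][x][1])
            for x in range(w)
        ]
        for y in range(h)
    ]


if __name__ == "__main__":
    key_feat = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    # Every current-frame pixel matches the keyframe pixel one column to the right.
    flow = [[(0.0, 1.0)] * 3 for _ in range(3)]
    for row in warp_features(key_feat, flow):
        print(row)
```

In a real system the flow field would be estimated between the keyframe and the current frame, and the warp would be applied per channel to multi-channel feature tensors on the GPU; both are outside the scope of this sketch.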