Object detection is a fundamental but challenging task in computer vision and plays a key role in a variety of industrial applications. However, object detectors based on deep learning usually require more storage and longer inference time, which seriously hinders their practicality. A trade-off between effectiveness and efficiency is therefore necessary in practical scenarios. Free from the constraint of pre-defined anchors, anchor-free detectors can achieve acceptable accuracy and inference speed simultaneously. In this paper, we start from an anchor-free detector called TTFNet, modify its structure, and introduce multiple existing tricks to realize effective server and mobile solutions, respectively. Since all experiments in this paper are conducted with PaddlePaddle, we call the model PAFNet (Paddle Anchor Free Network). On the server side, PAFNet achieves a better balance between effectiveness (42.2% mAP) and efficiency (67.15 FPS) on a single V100 GPU. On the mobile side, PAFNet-lite achieves a better accuracy (23.9% mAP) at 26.00 ms on a Kirin 990 ARM CPU, outperforming the existing state-of-the-art anchor-free detectors by significant margins. Source code is at https://github.com/PaddlePaddle/PaddleDetection.