Various models have been proposed to solve the object detection problem, but most of them require many hand-designed components to achieve good performance. To mitigate this issue, the Transformer-based DETR and its variant Deformable DETR were proposed. They resolved much of the complexity of designing an object detection head, yet it has remained unclear whether Transformer-based models can be regarded as the undisputed state of the art in object detection. Furthermore, since DETR adopts a Transformer only for the detection head while retaining a CNN backbone, it has been uncertain whether a competitive end-to-end pipeline can be built from a combination of attention modules alone. In this paper, we show that combining several attention modules with our new Task-Specific Split Transformer (TSST) is sufficient to achieve the best COCO results without traditionally hand-designed components. By splitting a general-purpose attention module into two separate task-specific attention modules, the proposed method offers a way to design simpler object detection models than before. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. Code is released at https://github.com/navervision/tsst
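To make the core idea concrete, below is a minimal sketch of what splitting one shared attention module into two task-specific branches could look like. This is an illustrative assumption based only on the abstract, not the paper's actual implementation; the class name, the choice of classification vs. box-regression branches, and all dimensions are hypothetical.

```python
# Illustrative sketch: replace a single shared attention module with two
# separate, task-specific attention branches over the same object queries.
# All names and shapes are assumptions, not the released TSST code.
import torch
import torch.nn as nn


class TaskSpecificSplitAttention(nn.Module):
    """Two task-specific cross-attention branches over shared object queries."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # One attention module per sub-task instead of one general-purpose module.
        self.cls_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.box_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, queries: torch.Tensor, memory: torch.Tensor):
        # queries: (batch, num_queries, dim); memory: (batch, hw, dim) encoder features
        cls_feat, _ = self.cls_attn(queries, memory, memory)  # classification branch
        box_feat, _ = self.box_attn(queries, memory, memory)  # localization branch
        return cls_feat, box_feat


# Usage: each branch would feed its own prediction head.
attn = TaskSpecificSplitAttention()
q = torch.randn(2, 100, 256)      # 100 object queries
mem = torch.randn(2, 1024, 256)   # flattened encoder features
cls_feat, box_feat = attn(q, mem)
```

Under this reading, the split lets each branch specialize its attention pattern for its own sub-task while the query set and encoder features stay shared.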