Weakly-Supervised Object Detection (WSOD) and Localization (WSOL), i.e., detecting multiple instances and a single instance, respectively, with bounding boxes in an image using only image-level labels, are long-standing and challenging tasks in the computer vision community. With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention, and hundreds of methods and numerous techniques have been proposed in the deep learning era. In this paper, we treat WSOL as a sub-task of WSOD and provide a comprehensive survey of recent achievements in WSOD. Specifically, we first describe the formulation and setting of WSOD, including its background, challenges, and basic framework. Next, we summarize and analyze the advanced techniques and training tricks used to improve detection performance. Then, we introduce the widely used datasets and evaluation metrics for WSOD. Lastly, we discuss future directions for WSOD. We believe that these summaries can help pave the way for future research on WSOD and WSOL.