In this paper, we propose a method for ensembling the outputs of multiple object detectors to improve detection performance and bounding-box precision on image data. We further extend the method to video data with a two-stage, tracking-based scheme for detection refinement. The proposed method can be used as a standalone approach for improving object detection performance, or as part of a framework for faster bounding-box annotation of unseen datasets, assuming that the objects of interest are among those present in common public datasets.