The evaluation of object detection models is usually performed by optimizing a single metric, e.g. mAP, on a fixed set of datasets, e.g. Microsoft COCO and Pascal VOC. Due to image retrieval and annotation costs, these datasets consist largely of images found on the web and do not represent many real-life domains that are being modelled in practice, e.g. satellite, microscopic, and gaming, making it difficult to assess the degree of generalization learned by a model. We introduce Roboflow-100 (RF100), a benchmark consisting of 100 datasets, 7 imagery domains, 224,714 images, and 805 class labels, representing over 11,170 labelling hours. We derived RF100 from over 90,000 public datasets and 60 million public images that are actively being assembled and labelled by computer vision practitioners in the open on the web application Roboflow Universe. By releasing RF100, we aim to provide a semantically diverse, multi-domain benchmark of datasets to help researchers test their models' generalizability with real-life data. The RF100 download and benchmark replication scripts are available on GitHub.