We present SIMCO, the first agnostic multi-class object counting approach. SIMCO starts by detecting foreground objects through a novel Mask RCNN-based architecture trained beforehand (just once) on a brand-new synthetic 2D shape dataset, InShape; the idea is to highlight every object resembling a primitive 2D shape (circle, square, rectangle, etc.). Each detected object is described by a low-dimensional embedding, obtained from a novel similarity-based head branch; the latter implements a triplet loss, encouraging similar objects (same 2D shape, color, and scale) to map close to one another. SIMCO then clusters these embeddings, so that different types of objects can emerge and be counted, making SIMCO the very first multi-class unsupervised counter. Experiments show that SIMCO provides state-of-the-art scores on counting benchmarks and that it can also help in many challenging image understanding tasks.
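The embed-then-cluster counting idea can be illustrated with a minimal sketch. It assumes the standard triplet margin formulation and swaps in a simple greedy threshold clustering; the abstract does not state the exact loss variant or clustering algorithm, so the function names, the `margin` and `radius` parameters, and the toy embeddings below are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an assumption, not taken from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def count_by_clustering(embeddings, radius=0.5):
    """Count objects per cluster with a greedy threshold clustering.

    Each embedding joins the first cluster whose centroid lies within
    `radius`; otherwise it opens a new cluster. This is a stand-in
    (assumption) for whatever clustering SIMCO actually uses.
    """
    centroids = []  # running mean of each cluster
    members = []    # embeddings assigned to each cluster
    labels = []     # cluster index per input embedding
    for e in embeddings:
        for k, c in enumerate(centroids):
            if euclidean(e, c) <= radius:
                members[k].append(e)
                n = len(members[k])
                # Update the centroid to the mean of its members.
                centroids[k] = [sum(v[i] for v in members[k]) / n for i in range(len(e))]
                labels.append(k)
                break
        else:
            # No existing cluster is close enough: start a new one.
            centroids.append(list(e))
            members.append([e])
            labels.append(len(centroids) - 1)
    return Counter(labels)


# Toy 2D embeddings: two objects of one type, three of another.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
counts = count_by_clustering(toy_embeddings)
```

In this sketch, `counts` maps each discovered cluster index to the number of objects assigned to it, so the number of keys plays the role of the number of object types and each value is a per-type count.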