In this paper, we introduce CoKe, a contrastive learning framework for keypoint detection. Keypoint detection differs from other visual tasks to which contrastive learning has been applied because the input is a set of images in which multiple keypoints are annotated. Contrastive learning must therefore be extended so that keypoints are represented and detected independently, which allows the contrastive loss to push keypoint features apart from each other and from the background. Our approach has two benefits: it enables us to exploit contrastive learning for keypoint detection, and, because each keypoint is detected independently, detection becomes more robust to occlusion than with holistic methods, such as stacked hourglass networks, which attempt to detect all keypoints jointly. CoKe introduces several technical innovations: (i) a clutter bank that represents non-keypoint features; (ii) a keypoint bank that stores prototypical representations of keypoints to approximate the contrastive loss between keypoints; and (iii) a cumulative moving average update that learns the keypoint prototypes while the feature extractor is trained. Our experiments on a range of diverse datasets (PASCAL3D+, MPII, ObjectNet3D) show that our approach performs as well as, or better than, alternative methods for keypoint detection, even for human keypoints, for which the literature is vast. Moreover, we observe that CoKe is exceptionally robust to partial occlusion and previously unseen object poses.
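The three components named above (clutter bank, keypoint bank, cumulative moving average update) can be illustrated with a minimal sketch. This is not the CoKe implementation: the function names, the InfoNCE-style form of the loss, the cosine similarity, and the temperature value are all assumptions made for illustration, and the paper's exact loss and update rule may differ. Only the cumulative moving average formula itself is standard.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cma_update(prototype, feature, count):
    """Standard cumulative moving average: the mean of all features seen so far.

    `count` is the number of features already averaged into `prototype`.
    """
    new_count = count + 1
    updated = [(count * p + f) / new_count for p, f in zip(prototype, feature)]
    return updated, new_count


def keypoint_contrastive_loss(feature, k, keypoint_bank, clutter_bank,
                              temperature=0.1):
    """Assumed InfoNCE-style loss for a feature extracted at keypoint `k`.

    The positive is prototype `k`; the negatives are the other keypoint
    prototypes and every entry of the clutter (background) bank.
    """
    f = l2_normalize(feature)
    kp_logits = [dot(f, l2_normalize(p)) / temperature for p in keypoint_bank]
    clutter_logits = [dot(f, l2_normalize(c)) / temperature for c in clutter_bank]
    all_logits = kp_logits + clutter_logits
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(all_logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in all_logits))
    return log_denominator - kp_logits[k]


# Hypothetical toy data: two keypoint prototypes and one clutter feature.
keypoint_bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
clutter_bank = [[0.0, 0.0, 1.0]]
feature = [1.0, 0.0, 0.0]

loss_correct = keypoint_contrastive_loss(feature, 0, keypoint_bank, clutter_bank)
loss_wrong = keypoint_contrastive_loss(feature, 1, keypoint_bank, clutter_bank)
```

In this toy setting the loss is small when the feature matches its own prototype and large when it is assigned to a different keypoint, which is the behavior that separates keypoint features from each other and from the background. Because each keypoint has its own prototype, a detector built this way can score every keypoint independently of the others.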