Learning effective visual representations without human supervision is a long-standing problem in computer vision. Recent advances in self-supervised learning have built on contrastive learning, with methods such as SimCLR, which applies a composition of augmentations to an image and minimizes a contrastive loss between the representations of the two augmented views. In this paper, we present CLAWS, an annotation-efficient learning framework that addresses the problem of manually labeling large-scale agricultural datasets, along with potential applications such as anomaly detection and plant growth analytics. CLAWS uses a network backbone inspired by SimCLR together with weak supervision to investigate the effect of contrastive learning within class clusters. In addition, we apply a hard attention mask to the cropped input image before maximizing agreement between the image pairs using a contrastive loss function; this mask forces the network to focus on pertinent object features and to ignore background features. We compare a supervised SimCLR against CLAWS on an agricultural dataset of 227,060 samples spanning 11 crop classes. Our experiments and extensive evaluations show that CLAWS achieves a competitive NMI score of 0.7325. Furthermore, CLAWS produces low-dimensional representations of very large datasets with minimal parameter tuning and forms well-defined clusters, which lend themselves to efficient, transparent, and highly interpretable clustering methods such as Gaussian Mixture Models.
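To make the contrastive objective concrete, the sketch below implements the normalized temperature-scaled cross-entropy (NT-Xent) loss that SimCLR-style methods minimize between the two augmented views of each image. This is an illustrative NumPy sketch, not the paper's implementation; the function name, batch shapes, and default temperature are our own assumptions.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss (SimCLR-style), illustrative sketch.

    z1, z2: (N, d) embeddings of the two augmented views of N images.
    Each view's positive is its partner; all other 2N-2 views in the
    batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)                # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit vectors -> cosine similarity
    sim = (z @ z.T) / temperature                       # (2N, 2N) scaled similarities
    n = len(z1)
    # Exclude self-similarity so an embedding cannot match itself.
    np.fill_diagonal(sim, -np.inf)
    # The positive index for row i is i+N (and i-N for the second half).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_softmax[np.arange(2 * n), pos].mean()
```

Under this objective, embeddings of the two views of the same image are pulled together while all other pairs in the batch are pushed apart; well-aligned positive pairs therefore yield a lower loss than unrelated pairs.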