Self-supervised learning has proved to be a powerful approach to learning image representations without the need for large labeled datasets. In underwater robotics, there is great interest in computer vision algorithms that improve perception capabilities such as sonar image classification. Because sonar imaging is often confidential and sonar images are difficult to interpret, it is challenging to create large public labeled sonar datasets for training supervised learning algorithms. In this work, we investigate the potential of three self-supervised learning methods (RotNet, Denoising Autoencoders, and Jigsaw) to learn high-quality sonar image representations without the need for human labels. We present pre-training and transfer learning results on real-life sonar image datasets. Our results indicate that self-supervised pre-training yields classification performance comparable to supervised pre-training in a few-shot transfer learning setup across all three methods. Code and self-supervised pre-trained models are available at https://github.com/agrija9/ssl-sonar-images
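To make the idea of a label-free pretext task concrete, the following is a minimal sketch of how rotation-prediction data in the style of RotNet could be generated: each image is rotated by 0, 90, 180, and 270 degrees, and the rotation index serves as the training label. The function name make_rotation_batch, the 96x96 patch size, and the random stand-in data are assumptions made for illustration; they are not taken from the authors' repository, and the actual pipeline may differ.

```python
import numpy as np


def make_rotation_batch(images):
    """Build a rotation-prediction batch from unlabeled images.

    images: array of shape (N, H, W) holding square grayscale patches
            (H == W is assumed so that all rotations share one shape).
    Returns (rotated, labels), where rotated has shape (4N, H, W) and
    labels[i] in {0, 1, 2, 3} counts the quarter-turns applied.
    """
    rotated, labels = [], []
    for k in range(4):  # 0, 90, 180, 270 degrees
        rotated.append(np.rot90(images, k=k, axes=(1, 2)))
        labels.append(np.full(len(images), k, dtype=np.int64))
    return np.concatenate(rotated), np.concatenate(labels)


# Hypothetical stand-in for unlabeled sonar patches (random values, not real data).
rng = np.random.default_rng(0)
patches = rng.random((8, 96, 96)).astype(np.float32)

x, y = make_rotation_batch(patches)
```

A classifier trained to predict y from x needs no human annotation, and its encoder can then be reused for few-shot transfer to a labeled sonar classification task.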