We use a contrastive self-supervised learning framework to estimate distances to galaxies from their photometric images. We incorporate data augmentations from computer vision as well as an application-specific augmentation accounting for galactic dust. We find that the resulting visual representations of galaxy images are semantically useful, allow for fast similarity searches, and can be successfully fine-tuned for the task of redshift estimation. We show that (1) pretraining on a large corpus of unlabeled data followed by fine-tuning on some labels can attain the accuracy of a fully supervised model that requires 2-4x more labeled data, and (2) by fine-tuning our self-supervised representations using all available data labels in the Main Galaxy Sample of the Sloan Digital Sky Survey (SDSS), we outperform the state-of-the-art supervised learning method.
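As an illustration of the similarity search mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes each galaxy image has already been mapped to a fixed-length representation vector; the function name `nearest_neighbors`, the 128-dimensional size, and the random placeholder vectors are all hypothetical choices made for this example, and cosine similarity is one common ranking measure rather than one confirmed by the text.

```python
import numpy as np


def nearest_neighbors(embeddings, query_index, k=5):
    """Return indices and cosine similarities of the k rows closest to the query row.

    `embeddings` is an (n_galaxies, dim) array of representation vectors.
    This helper is hypothetical and exists only for illustration.
    """
    # Scale every row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    # Cosine similarity of every galaxy to the query galaxy.
    sims = unit @ unit[query_index]

    # Exclude the query itself from the ranking.
    sims[query_index] = -np.inf

    # Sort by descending similarity and keep the top k.
    top = np.argsort(-sims)[:k]
    return top, sims[top]


# Placeholder data (assumption): random vectors stand in for learned representations.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128))

# Plant a near-duplicate of galaxy 0 at index 1 so the search has a known answer.
embeddings[1] = embeddings[0] + 0.01 * rng.normal(size=128)

idx, scores = nearest_neighbors(embeddings, query_index=0, k=5)
```

Because the rows are normalized once, each query reduces to a single matrix-vector product, which is what makes this kind of lookup fast once representations are precomputed. For very large catalogs an approximate nearest-neighbor index would likely replace the exhaustive scan.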