As supervised deep learning methods are increasingly adopted for processing and analyzing cosmological survey data, it becomes more important to assess the effects of data perturbations, which can occur naturally in data processing and analysis pipelines, and to develop methods that increase model robustness. We study the effects of perturbations in imaging data in the context of morphological classification of galaxies. In particular, we examine the consequences of training neural networks on baseline data and testing them on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise, represented by higher levels of Poisson noise, and 2) data processing noise incurred by steps such as image compression or by telescope errors, represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that pixel-level processing errors easily flip the classification into an incorrect class, and that higher observational noise leaves a model trained on low-noise data unable to classify galaxy morphologies. In contrast, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases the latent space distance between the baseline image and the incorrectly classified one-pixel perturbed image by a factor of ~2.3, making the model more robust to inadvertent perturbations.
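The two perturbation types and the latent space distance named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function names, the `exposure_scale` parameter, and the choice of Euclidean distance are assumptions made for illustration, and the one-pixel function only applies a given pixel change rather than searching for an adversarial one.

```python
import numpy as np


def add_poisson_noise(image, exposure_scale, rng):
    """Resample pixel values as Poisson counts (illustrative assumption).

    An exposure_scale below 1 mimics fewer collected counts, which
    raises the relative level of shot noise in the returned image.
    """
    counts = np.clip(image, 0, None) * exposure_scale
    noisy = rng.poisson(counts).astype(np.float64)
    # Rescale so the noisy image has the same mean flux as the input.
    return noisy / exposure_scale


def one_pixel_perturbation(image, row, col, value):
    """Return a copy of the image with a single pixel replaced.

    A real one-pixel attack would search for the (row, col, value)
    that flips the classifier output; that search is omitted here.
    """
    perturbed = image.copy()
    perturbed[row, col] = value
    return perturbed


def latent_distance(z_a, z_b):
    """Euclidean distance between two latent vectors (assumed metric)."""
    return float(np.linalg.norm(np.asarray(z_a) - np.asarray(z_b)))


rng = np.random.default_rng(0)
baseline = np.full((8, 8), 100.0)  # hypothetical flat test image
noisy = add_poisson_noise(baseline, exposure_scale=0.1, rng=rng)
perturbed = one_pixel_perturbation(baseline, row=2, col=3, value=0.0)
```

In practice the latent vectors passed to `latent_distance` would come from a trained network's embedding of the baseline and perturbed images; that model is outside the scope of this sketch.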