With the rapid growth of galaxy sky surveys in recent years, the volume of observations has increased quickly, making machine learning methods for galaxy image recognition a hot research topic. Existing work on automatic galaxy image recognition is hampered by three problems: large differences in similarity between categories, imbalance in the amount of data across classes, and the discrepancy between the discrete representation of galaxy classes and the essentially gradual change from one morphological class to the adjacent class (DDRGC). These limitations have motivated several astronomers and machine learning experts to design projects with improved galaxy image recognition capabilities. This paper therefore proposes a novel learning method, ``Hierarchical Imbalanced data learning with Weighted sampling and Label smoothing'' (HIWL). HIWL consists of three key techniques, each addressing one of the three problems above: (1) a hierarchical galaxy classification model built on an efficient backbone network; (2) a weighted sampling scheme to deal with the class imbalance; and (3) a label smoothing technique to alleviate the DDRGC problem. We apply the method to galaxy photometric images from the Galaxy Zoo-The Galaxy Challenge, recognizing five classes: completely round smooth, in-between smooth, cigar-shaped, edge-on, and spiral. The overall classification accuracy is 96.32\%, and comparisons with related works in terms of recall, precision, and F1-score show several advantages of HIWL. In addition, we visualize the galaxy image features and the model's attention to understand the foundations of the proposed scheme.
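As a rough illustration of techniques (2) and (3), the sketch below shows one common way to build inverse-frequency sampling weights and uniformly smoothed targets in plain Python. The abstract does not specify the weighting formula or the smoothing strength, so the inverse-frequency rule, the value `epsilon=0.1`, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample the weight 1 / (size of its class).

    Assumption: inverse class frequency; the paper's exact scheme may differ.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample(labels, k, seed=0):
    """Draw k sample indices with replacement.

    With inverse-frequency weights, every class is drawn equally often
    in expectation, regardless of how many images it contains.
    """
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


def smooth_label(class_index, num_classes, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes.

    Assumption: epsilon = 0.1 is a conventional default, not a value
    taken from the paper.
    """
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[class_index] += 1.0 - epsilon
    return target
```

For example, with five classes and `epsilon=0.1`, `smooth_label(2, 5)` places about 0.92 on the true class and 0.02 on each of the other four, so the target no longer treats neighbouring morphological classes as entirely wrong.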