Peeking inside the Black Box: Interpreting Deep Learning Models for Exoplanet Atmospheric Retrievals

Deep learning algorithms are growing in popularity in exoplanetary science because they can model highly non-linear relations and solve problems in a data-driven manner. Several works have attempted fast retrievals of atmospheric parameters using machine learning algorithms such as deep neural networks (DNNs). Yet, despite their high predictive power, DNNs are also infamous for being 'black boxes'. It is their apparent lack of explainability that makes the astrophysics community reluctant to adopt them. What are their predictions based on? How confident should we be in them? When are they wrong, and how wrong can they be? In this work, we present a number of general evaluation methodologies that can be applied to any trained model and that answer questions like these. In particular, we train three popular DNN architectures to retrieve atmospheric parameters from exoplanet spectra and show that all three achieve good predictive performance. We then present an extensive analysis of the DNNs' predictions, which can inform us, among other things, of the credibility limits for atmospheric parameters for a given instrument and model. Finally, we perform a perturbation-based sensitivity analysis to identify the features of the spectrum to which the outcome of the retrieval is most sensitive. We conclude that, for different molecules, the wavelength ranges to which the DNN's predictions are most sensitive do indeed coincide with their characteristic absorption regions. The methodologies presented in this work help to improve the evaluation of DNNs and to grant interpretability to their predictions.
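The perturbation-based sensitivity analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `perturbation_sensitivity`, the stand-in model `toy_model`, the bin width, the Gaussian noise scale, and the number of trials are all hypothetical choices made for illustration. The idea shown is simply to add noise to one wavelength bin at a time and record how much each predicted parameter moves on average.

```python
import numpy as np


def perturbation_sensitivity(predict, spectrum, bin_width=5,
                             noise_scale=0.05, n_trials=50, seed=0):
    """Mean absolute change in each predicted parameter when a single
    wavelength bin of the input spectrum is perturbed with Gaussian noise.

    `predict` maps a batch of spectra (n, n_channels) to (n, n_params).
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    baseline = predict(spectrum[None, :])[0]            # shape (n_params,)
    n_bins = int(np.ceil(spectrum.size / bin_width))
    sensitivity = np.zeros((n_bins, baseline.size))

    for b in range(n_bins):
        lo = b * bin_width
        hi = min(lo + bin_width, spectrum.size)
        # Copy the spectrum n_trials times and perturb only channels lo..hi-1.
        perturbed = np.tile(spectrum, (n_trials, 1))
        perturbed[:, lo:hi] += rng.normal(0.0, noise_scale,
                                          size=(n_trials, hi - lo))
        delta = predict(perturbed) - baseline
        sensitivity[b] = np.abs(delta).mean(axis=0)
    return sensitivity


def toy_model(batch):
    """Hypothetical stand-in for a trained DNN: parameter 0 depends only on
    channels 10-19 and parameter 1 only on channels 30-39."""
    return np.stack([batch[:, 10:20].sum(axis=1),
                     batch[:, 30:40].sum(axis=1)], axis=1)


spectrum = np.linspace(1.0, 2.0, 50)                    # synthetic 50-channel spectrum
sens = perturbation_sensitivity(toy_model, spectrum)
most_sensitive_bins = sens.argmax(axis=0)               # one bin index per parameter
```

With a real retrieval network, `predict` would wrap the trained model's forward pass, and the bins with the largest sensitivity for a given molecular abundance could then be compared against that molecule's known absorption bands.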
