It has recently been demonstrated that deep learning has significant potential to automate parts of the exoplanet detection pipeline using light curve data from satellites such as Kepler \cite{borucki2010kepler,koch2010kepler} and NASA's Transiting Exoplanet Survey Satellite (TESS) \cite{ricker2010transiting}. Unfortunately, the small size of the available datasets makes it difficult to realize the level of performance one expects from powerful network architectures. In this paper, we investigate the use of data augmentation techniques on light curve data to train neural networks to identify exoplanets. The augmentation techniques used fall into two classes: simple (e.g., additive noise augmentation) and learning-based (e.g., first training a GAN \cite{goodfellow2020generative} to generate new examples). We demonstrate that data augmentation has the potential to improve model performance on the exoplanet detection problem, and we recommend the use of augmentation based on generative models as more data becomes available.