Deep learning has recently performed remarkably well on many time series analysis tasks. The superior performance of deep neural networks relies heavily on a large amount of training data to avoid overfitting. However, labeled data may be limited in many real-world time series applications, such as medical time series classification and anomaly detection in AIOps. As an effective way to increase the size and quality of the training data, data augmentation is crucial to the successful application of deep learning models to time series data. In this paper, we systematically review data augmentation methods for time series. We propose a taxonomy for the reviewed methods and then provide a structured review that highlights their strengths and limitations. We also empirically compare different data augmentation methods on different tasks, including time series anomaly detection, classification, and forecasting. Finally, we discuss and highlight five future directions to provide useful research guidance.