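Overfitting (over-fitting, also called overlearning) in AI usually refers to a learned model that is too complex: it performs well on the training set but poorly on the test set, which means it generalizes badly. It arises during parameter fitting because the training data contain sampling error, and a complex model fits that sampling error along with the underlying pattern.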
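The effect can be reproduced in a few lines. The following is a minimal sketch, assuming a synthetic sine signal with Gaussian noise and polynomial regression via NumPy; the data, noise level, and polynomial degrees are illustrative choices, not taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples of an assumed ground-truth signal sin(pi * x)."""
    x = np.sort(rng.uniform(-1.0, 1.0, n))
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a polynomial (highest power first) on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Small training set, larger held-out test set from the same distribution.
x_train, y_train = make_data(15)
x_test, y_test = make_data(200)

results = {}
for degree in (3, 12):
    # Least-squares polynomial fit on the training data only.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

The degree-12 polynomial has almost as many coefficients as there are training points, so it can chase the noise: its training error drops below that of the degree-3 fit, while its test error is typically much larger.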

VIP content

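Title: Time Series Data Augmentation for Deep Learning: A Survey

Abstract:

In recent years, deep learning has performed remarkably well on many time series analysis tasks. The superior performance of deep neural networks relies heavily on large amounts of training data to avoid overfitting. However, labeled data may be limited in many real-world time series applications, such as classification of medical time series and anomaly detection in AIOps. Data augmentation is an effective way to increase the size and quality of training data, and it is key to applying deep learning models successfully to time series data. This paper systematically reviews data augmentation methods for time series. We propose a taxonomy for these methods and then provide a structured review that highlights their strengths and limitations. We also empirically compare data augmentation methods on different tasks, including time series anomaly detection, classification, and forecasting. Finally, we discuss and highlight future research directions, including data augmentation in the time-frequency domain, augmentation combination, and data augmentation and weighting for imbalanced classes.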
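To make the idea of time series augmentation concrete, here is a minimal sketch of three commonly used time-domain transforms (noise jittering, magnitude scaling, and window slicing). The function names and parameter values are hypothetical choices for illustration, and this is not presented as the set of methods the survey evaluates.

```python
import numpy as np

def jitter(series, sigma, rng):
    """Add independent Gaussian noise to every time step."""
    return series + rng.normal(0.0, sigma, size=series.shape)

def scale(series, sigma, rng):
    """Multiply the whole series by one random factor drawn around 1.0."""
    factor = rng.normal(1.0, sigma)
    return series * factor

def window_slice(series, ratio, rng):
    """Crop a random contiguous window covering `ratio` of the series."""
    length = max(1, int(len(series) * ratio))
    start = rng.integers(0, len(series) - length + 1)
    return series[start:start + length]

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 4.0 * np.pi, 100))  # toy input series

augmented = [
    jitter(series, 0.05, rng),
    scale(series, 0.1, rng),
    window_slice(series, 0.8, rng),
]
print([a.shape for a in augmented])
```

Each call yields a new training example derived from the original series. Whether a given transform preserves the label depends on the task; slicing, for example, may cut out the very anomaly a detector is meant to find.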


Latest papers

The recursive and hierarchical structure of full rooted trees is applicable to represent statistical models in various areas, such as data compression, image processing, and machine learning. In most of these cases, the full rooted tree is not a random variable; as such, model selection to avoid overfitting becomes problematic. A method to solve this problem is to assume a prior distribution on the full rooted trees. This enables the optimal model selection based on the Bayes decision theory. For example, by assigning a low prior probability to a complex model, the maximum a posteriori estimator prevents the selection of the complex one. Furthermore, we can average all the models weighted by their posteriors. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is suitable for calculating the properties of our distribution using recursive functions, such as the mode, expectation, and posterior distribution. Although such distributions have been proposed in previous studies, they are only applicable to specific applications. Therefore, we extract their mathematically essential components and derive new generalized methods to calculate the expectation, posterior distribution, etc.
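As a rough illustration of how a prior over full rooted trees can be evaluated recursively and can penalize complex models, the sketch below assumes one simple prior: every node above a maximum depth independently becomes an inner node with probability `g` and a leaf otherwise. This prior, the `Node` class, and the parameter values are assumptions made for illustration; the abstract does not specify the distribution the paper proposes.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A node of a full rooted tree: either no children (leaf) or k children."""
    children: List["Node"] = field(default_factory=list)

def tree_prior(node, g, depth=0, max_depth=3):
    """Prior probability of the subtree at `node` under the assumed split prior."""
    if depth == max_depth:
        # Nodes at the maximum depth are forced to be leaves.
        return 1.0 if not node.children else 0.0
    if not node.children:
        return 1.0 - g  # the node stopped splitting
    p = g  # the node split; recurse into each child
    for child in node.children:
        p *= tree_prior(child, g, depth + 1, max_depth)
    return p

# Two full binary trees: a root with two leaves, and a deeper tree.
small = Node([Node(), Node()])
large = Node([Node([Node(), Node()]), Node()])

print(tree_prior(small, g=0.3))
print(tree_prior(large, g=0.3))
```

Under this assumed prior the larger tree receives the lower prior probability, so a maximum a posteriori estimator would select it only if the data likelihood favors it strongly enough to outweigh the penalty.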
