Irregularly sampled time series data arise naturally in many application domains, including biology, ecology, climate science, astronomy, and health. Such data pose fundamental challenges to many classical models from machine learning and statistics because the intervals between observations are non-uniform. However, over the last decade the machine learning community has made significant progress in developing specialized models and architectures for learning from irregularly sampled univariate and multivariate time series data. In this survey, we first describe several axes along which approaches differ, including what data representations they are based on, what modeling primitives they leverage to deal with the fundamental problem of irregular sampling, and what inference tasks they are designed to perform. We then survey the recent literature, organized primarily along the axis of modeling primitives. We describe approaches based on temporal discretization, interpolation, recurrence, attention, and structural invariance. We discuss similarities and differences between approaches and highlight their primary strengths and weaknesses.