Time-series data with missing values are common in many fields, such as healthcare, meteorology, and robotics. Imputation aims to fill the missing values with valid values. Most imputation methods train their models implicitly, because the missing values have no ground truth. In this paper, we propose Random Drop Imputation with Self-training (RDIS), a novel training method for time-series imputation models. In RDIS, we generate extra missing values by randomly dropping observed values in the incomplete data. Because the dropped values are known, we can train the imputation models explicitly by having them fill in the randomly dropped values. In addition, we adopt self-training with pseudo values to exploit the original missing values. To improve the quality of the pseudo values, we compute their entropy and filter them with a threshold. To verify the effectiveness of RDIS for time-series imputation, we apply RDIS to various imputation models and achieve competitive results on two real-world datasets.
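The random-drop step described above can be illustrated with a minimal NumPy sketch. It is not the RDIS implementation: the function names (`random_drop`, `masked_mae`, `mean_impute`), the use of NaN to mark missing entries, the independent per-entry drop with rate `drop_rate`, the mean-absolute-error loss, and the per-feature mean imputer are all assumptions made for illustration. The self-training and entropy-filtering steps are not sketched, since the abstract does not specify them.

```python
import numpy as np


def random_drop(x, drop_rate=0.2, rng=None):
    """Hide a random subset of the observed entries of x (NaN marks missing).

    Returns the array with extra NaNs and a boolean mask of the dropped entries.
    """
    rng = np.random.default_rng() if rng is None else rng
    observed = ~np.isnan(x)
    # Assumption: each observed entry is dropped independently with
    # probability drop_rate; originally missing entries are never selected.
    dropped = observed & (rng.random(x.shape) < drop_rate)
    x_dropped = x.copy()
    x_dropped[dropped] = np.nan
    return x_dropped, dropped


def masked_mae(imputed, target, mask):
    """Mean absolute error restricted to the entries selected by mask."""
    return float(np.abs(imputed[mask] - target[mask]).mean())


def mean_impute(x):
    """Placeholder imputer (hypothetical stand-in for a learned model):
    fill NaNs with the per-feature mean of the observed values."""
    col_mean = np.nanmean(x, axis=0)
    return np.where(np.isnan(x), col_mean, x)


# Toy incomplete series: 50 time steps, 3 features, roughly 10% missing.
rng = np.random.default_rng(0)
series = rng.normal(size=(50, 3))
series[rng.random(series.shape) < 0.1] = np.nan

# The dropped entries have known values, so the error on them is an
# explicit training signal even though the original gaps have no ground truth.
x_dropped, dropped = random_drop(series, drop_rate=0.2, rng=rng)
loss = masked_mae(mean_impute(x_dropped), series, dropped)
print(f"dropped {int(dropped.sum())} observed values, masked MAE = {loss:.3f}")
```

In a training loop, a learned imputation model would take the place of `mean_impute`, and the masked loss would be minimized over repeated random drops.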