Consider multiple seasonal time series being collected in real time, in the form of a tensor stream. Real-world tensor streams often include missing entries (e.g., due to network disconnection) and, at the same time, unexpected outliers (e.g., due to system errors). Given such a real-world tensor stream, how can we estimate missing entries and predict future evolution accurately in real time? In this work, we answer this question by introducing SOFIA, a robust factorization method for real-world tensor streams. In a nutshell, SOFIA smoothly and tightly integrates tensor factorization, outlier removal, and temporal-pattern detection, which naturally reinforce each other. Moreover, SOFIA integrates them in linear time, in an online manner, despite the presence of missing entries. We experimentally show that SOFIA is (a) robust and accurate: yielding up to 76% lower imputation error and 71% lower forecasting error; (b) fast: up to 935× faster than the second-most accurate competitor; and (c) scalable: scaling linearly with the number of new entries per time step.
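To make the problem setting concrete, the following is a minimal toy sketch of imputation and forecasting on a single seasonal series with missing entries and one outlier. It is not SOFIA and does not perform tensor factorization; it swaps in a much simpler per-phase seasonal median as the robust estimator. The function names (`seasonal_medians`, `clean_series`, `seasonal_forecast`), the tolerance `tol`, and the synthetic data are all assumptions introduced for illustration.

```python
from statistics import median

def seasonal_medians(series, period):
    """Median of the observed (non-None) values at each seasonal phase."""
    medians = []
    for phase in range(period):
        observed = [x for x in series[phase::period] if x is not None]
        medians.append(median(observed) if observed else None)
    return medians

def clean_series(series, period, tol):
    """Fill missing entries and replace outliers with the phase median.

    An entry counts as an outlier (toy rule, an assumption) when it
    deviates from its phase median by more than `tol`.
    """
    medians = seasonal_medians(series, period)
    cleaned = []
    for i, x in enumerate(series):
        m = medians[i % period]
        if x is None or (m is not None and abs(x - m) > tol):
            cleaned.append(m)
        else:
            cleaned.append(x)
    return cleaned, medians

def seasonal_forecast(medians, start, horizon):
    """Forecast by repeating the per-phase medians from time index `start`."""
    period = len(medians)
    return [medians[(start + h) % period] for h in range(horizon)]

# Synthetic seasonal series with period 4 (hypothetical data).
pattern = [0.0, 1.0, 0.0, -1.0]
stream = pattern * 6
stream[5] = None     # missing entry, e.g. a network disconnection
stream[14] = None    # another missing entry
stream[9] = 100.0    # outlier, e.g. a system error

cleaned, medians = clean_series(stream, period=4, tol=5.0)
future = seasonal_forecast(medians, start=len(stream), horizon=4)
```

Because the median ignores a single extreme value per phase, the outlier at index 9 does not distort either the imputed values or the forecast. A factorization-based method such as the one described above would additionally share information across multiple series, which this single-series sketch does not attempt.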