Consider traffic data (i.e., triplets in the form of source-destination-timestamp) that grow over time. Tensors (i.e., multi-dimensional arrays) with a time mode are widely used for modeling and analyzing such multi-aspect data streams. In such tensors, however, new entries are added only once per period, which is often an hour, a day, or even a year. This discreteness of tensors has limited their usage for real-time applications, where new data should be analyzed instantly as it arrives. How can we analyze time-evolving multi-aspect sparse data 'continuously' using tensors where time is 'discrete'? We propose SLICENSTITCH for continuous CANDECOMP/PARAFAC (CP) decomposition, which has numerous time-critical applications, including anomaly detection, recommender systems, and stock market prediction. SLICENSTITCH changes the starting point of each period adaptively, based on the current time, and updates factor matrices (i.e., outputs of CP decomposition) instantly as new data arrives. We show, theoretically and experimentally, that SLICENSTITCH is (1) 'Any time': it updates factor matrices immediately, without having to wait until the current time period ends, (2) Fast: its constant-time updates are up to 464x faster than online methods, and (3) Accurate: its fitness is comparable (specifically, 72% to 100%) to that of offline methods.
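The idea of shifting each period's starting point with the current time can be illustrated with a minimal sketch. It is not the authors' update algorithm and performs no CP decomposition; it only shows, under assumed conventions, how the same events fall into different time slices as the current time advances. The function name `window_tensor`, the event layout `(source, destination, timestamp, value)`, and the half-open slice boundaries are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def window_tensor(events, now, period, num_slices):
    """Build a sparse 3-mode tensor (source, destination, time slice).

    Assumption for illustration: the window covers (now - period * num_slices, now],
    and slice k covers (start + k * period, start + (k + 1) * period].
    The slice boundaries are therefore anchored at `now`, not at fixed clock ticks.
    """
    start = now - period * num_slices
    tensor = defaultdict(float)
    for src, dst, ts, value in events:
        if start < ts <= now:
            # Index of the slice that contains the timestamp `ts`.
            k = math.ceil((ts - start) / period) - 1
            tensor[(src, dst, k)] += value
    return dict(tensor)


# Hypothetical traffic events: (source, destination, timestamp, value).
events = [("a", "b", 3, 1.0), ("a", "b", 7, 1.0), ("a", "c", 12, 2.0)]

# The same stream viewed at two different current times.
tensor_at_10 = window_tensor(events, now=10, period=5, num_slices=2)
tensor_at_12 = window_tensor(events, now=12, period=5, num_slices=2)
```

With a conventional discrete tensor, the event at timestamp 12 would not become visible until its period closes. In this sketch it appears in the newest slice as soon as `now` reaches 12, and the two earlier events move into the same older slice because the boundaries have shifted.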