We study statistical inference on the similarity (distance) between two time series in an uncertain environment by considering a statistical hypothesis test on the distance computed by the Dynamic Time Warping (DTW) algorithm. The sampling distribution of the DTW distance is difficult to derive because the distance is obtained from the solution of the DTW algorithm (the optimal alignment), which depends on the data in a complicated way. To circumvent this difficulty, we propose to employ the conditional selective inference framework, which enables us to derive a valid inference method for the DTW distance. To our knowledge, this is the first method that provides a valid p-value quantifying the statistical significance of the DTW distance, which is helpful for high-stakes decision making such as anomalous time-series detection. We evaluate the performance of the proposed inference method on both synthetic and real-world datasets.
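For readers unfamiliar with DTW, the following is a minimal sketch of the standard dynamic-programming recursion that produces the distance being tested. It is an illustration only, not the implementation used in this work: the function name `dtw_distance` and the squared pointwise cost are assumptions, and the selective inference step itself is not shown.

```python
import math


def dtw_distance(x, y):
    """DTW distance between two 1-D sequences (illustrative sketch).

    Assumption: squared difference as the pointwise cost; other costs
    (e.g. absolute difference) are equally common.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(
                cost[i - 1][j],      # x[i-1] aligned with an already-used y[j-1]
                cost[i][j - 1],      # y[j-1] aligned with an already-used x[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

The `min` over predecessor cells means the chosen alignment changes with the observed values, which is the data-dependent selection that makes the sampling distribution of the resulting distance hard to characterize directly.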