We study variants of the mean problem under the $p$-Dynamic Time Warping ($p$-DTW) distance, a popular and robust distance measure for sequential data. In our setting we are given a set of finite point sequences over an arbitrary metric space and we want to compute a mean point sequence of given length that minimizes the sum of $p$-DTW distances, each raised to the $q$\textsuperscript{th} power, between the input sequences and the mean sequence. In general, the problem is $\mathrm{NP}$-hard and known not to be fixed-parameter tractable in the number of sequences. On the positive side, we show that restricting the length of the mean sequence significantly reduces the hardness of the problem. We give an exact algorithm running in polynomial time for constant-length means. We explore various approximation algorithms that provide a trade-off between the approximation factor and the running time. Our approximation algorithms have a running time with only linear dependency on the number of input sequences. In addition, we use our mean algorithms to obtain clustering algorithms with theoretical guarantees.
翻译:我们研究在美元- 动态时间转换( p$- DTW) 距离下的平均问题的变体,这是对相继数据的一种流行和稳健的距离测量。 在我们的设置中,我们得到一套任意的计量空间的定点序列,我们想要计算一个平均点序列的给定长度,以尽量减少美元- DTW 距离的总和,每段长度都升至$q$\ text上下标{th},在输入序列和平均序列之间。一般来说,问题在于美元- mathrm{NP} $- 硬和已知在序列数量上无法固定的参数。在正面,我们显示限制平均序列的长度会大大降低问题的硬性。我们给定出一个精确的算法,在常长手段的多元时间运行中进行计算。我们探索各种近似算法,在近似因素和运行时间之间提供一种交易。我们的近似算法在运行时,输入序列数上只有线性依赖。 此外,我们用我们的平均算法来获得带有理论保证的组合算法。