We present a polynomial-time algorithm that discovers all maximal patterns in a point set, $D\in\mathbb{R}^k$, that are related by transformations in a user-specified class, $F$, of bijections over $\mathbb{R}^k$. We also present a second algorithm that discovers the set of occurrences for each of these maximal patterns and then uses compact encodings of these occurrence sets to compute a losslessly compressed encoding of the input point set. This encoding takes the form of a set of pairs, $E=\left\lbrace\left\langle P_1, T_1\right\rangle,\left\langle P_2, T_2\right\rangle,\ldots\left\langle P_{\ell}, T_{\ell}\right\rangle\right\rbrace$, where each $\langle P_i,T_i\rangle$ consists of a maximal pattern, $P_i\subseteq D$, and a set, $T_i\subset F$, of transformations that map $P_i$ onto other subsets of $D$. Each transformation is encoded by a vector of real values that uniquely identifies it within $F$ and the length of this vector is used as a measure of the complexity of $F$. We evaluate the new compression algorithm with three transformation classes of differing complexity, on the task of classifying folk-song melodies into tune families. The most complex of the classes tested includes all combinations of the musical transformations of transposition, inversion, retrograde, augmentation and diminution. We found that broadening the transformation class improved performance on this task. However, it did not, on average, improve compression factor, which may be due to the datasets (in this case, folk-song melodies) being too short and simple to benefit from the potentially greater number of pattern relationships that are discoverable with larger transformation classes.
翻译:我们提出一个多元时算法, 在一个点集中发现所有最大值模式, $D\ in\ mathb{R ⁇ k$, 由用户指定的类的变换相关, $F$, 比$mathb{R ⁇ k$的双向反射值多。 我们还提出第二个算法, 为这些最大值的每种模式发现一系列事件, 然后使用这些事件的缩放编码, 来计算输入点集的无损压缩编码。 这个编码的形式是一组复杂的对子, $E ⁇ left\ lbrace\left\langle P_ 1, T_ 1\right\rangle,\ fleft\langle, T_\\\\\right\kangle,\light\rbrace $, 其中每个$lightncregregal 的变换代号可能由一个最大值的变换数组成。 $P_ireflexdealdealdeal $, 这个变数的变数中, 这个变数由最高级变数的变数由我们的变数的变数中, 变数由最变数的变数在3个变数中。