A new recursive construction of $N$-ary error-correcting output code (ECOC) matrices for ensemble classification methods is presented, generalizing the classic doubling construction for binary Hadamard matrices. Given any prime integer $N$, this deterministic construction generates base-$N$ symmetric square matrices $M$ of prime-power dimension with optimal minimum Hamming distance between any two of its rows and between any two of its columns. Experimental results on six datasets demonstrate that using these deterministic coding matrices for $N$-ary ECOC classification yields accuracy comparable to, and in many cases higher than, that obtained with randomly generated coding matrices. This is particularly true when $N$ is chosen adaptively so that the dimension of $M$ closely matches the number of classes in a dataset, which reduces the loss in minimum Hamming distance incurred when $M$ is truncated to fit the dataset. This is verified through a distance formula for $M$, which shows that these adaptive matrices have significantly higher minimum Hamming distance than randomly generated ones.
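The abstract does not spell out the recursion, so the following is only a minimal sketch of one natural way to generalize the Hadamard doubling step to a prime base $N$, and it may differ from the construction actually used here. It assumes a base matrix with entries $ij \bmod N$ and a recursive step that tiles $N \times N$ blocks, where block $(a, c)$ is the previous matrix plus $ac \bmod N$. The function names `base_matrix`, `extend`, `build`, and `min_row_distance` are hypothetical. For $N = 2$ the step reduces to the classic doubling $[[H, H], [H, \bar{H}]]$ written with symbols 0 and 1.

```python
from itertools import combinations


def base_matrix(n):
    """N x N seed matrix with entries i*j mod n (assumed seed, not from the source)."""
    return [[(i * j) % n for j in range(n)] for i in range(n)]


def extend(m, n):
    """One recursive step: block (a, c) of the result is m + a*c (mod n).

    For n = 2 this is the doubling step [[H, H], [H, H + 1 mod 2]].
    """
    size = len(m)
    b = base_matrix(n)
    return [
        [(m[i][j] + b[a][c]) % n for c in range(n) for j in range(size)]
        for a in range(n)
        for i in range(size)
    ]


def build(n, k):
    """Apply the step k - 1 times to obtain an n**k x n**k matrix (n assumed prime)."""
    m = base_matrix(n)
    for _ in range(k - 1):
        m = extend(m, n)
    return m


def min_row_distance(m):
    """Smallest Hamming distance over all pairs of distinct rows."""
    return min(
        sum(x != y for x, y in zip(r, s)) for r, s in combinations(m, 2)
    )


if __name__ == "__main__":
    m = build(3, 2)  # 9 x 9 matrix over {0, 1, 2}
    print(len(m), min_row_distance(m))
```

In this sketch every pair of distinct rows of the $N^k \times N^k$ matrix differs in exactly $N^k - N^{k-1}$ positions, and the matrix is symmetric, so the same holds for its columns. Truncating the matrix to fit a given number of classes, as described above, is not modeled here.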