Error-Correcting Output Codes (ECOC) are a successful technique for multi-class classification, a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi-class problem is decoupled into a set of binary problems that are solved independently. However, the literature defines a general error-correcting capability for ECOCs without analyzing how it is distributed among classes, which hinders a deeper analysis of pair-wise error correction. To address this limitation, this paper proposes an Error-Correcting Factorization (ECF) method. Our contribution is four-fold: (I) We propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes. (II) We derive the optimal code length of an ECOC using rank properties of the design matrix. (III) We formulate ECF as a discrete optimization problem and find a relaxed solution using an efficient constrained block coordinate descent approach. (IV) Enabled by the flexibility that the design matrix introduces, we propose to allocate the error correction to classes that are prone to confusion. Experimental results on several databases show that, when the error correction is allocated to confusable classes, ECF outperforms state-of-the-art approaches.
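To make the notion of pair-wise error correction concrete, the following is a minimal sketch and not the ECF method itself. It assumes that the correction available between two classes is governed by the Hamming distance d between their codewords, so that floor((d - 1) / 2) binary-classifier errors can be tolerated for that pair. The coding matrix `EXAMPLE_CODE` and all function names are hypothetical and chosen only for illustration.

```python
def hamming(u, v):
    """Number of positions where two codewords differ."""
    return sum(a != b for a, b in zip(u, v))


def pairwise_distances(code):
    """Matrix of Hamming distances between every pair of class codewords."""
    return [[hamming(u, v) for v in code] for u in code]


def pairwise_correction(code):
    """Errors tolerated per class pair, assuming floor((d - 1) / 2)."""
    return [[max((d - 1) // 2, 0) for d in row]
            for row in pairwise_distances(code)]


def decode(code, outputs):
    """Return the index of the class whose codeword is nearest to the outputs."""
    dists = [hamming(word, outputs) for word in code]
    return dists.index(min(dists))


# Hypothetical coding matrix: 3 classes (rows), 6 binary problems (columns).
EXAMPLE_CODE = [
    [+1, +1, +1, +1, +1, +1],
    [-1, -1, -1, -1, -1, +1],
    [+1, +1, +1, -1, -1, -1],
]

# One-vs-all coding for 4 classes, for comparison.
ONE_VS_ALL = [[+1 if i == j else -1 for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    print(pairwise_distances(EXAMPLE_CODE))
    print(pairwise_correction(EXAMPLE_CODE))
    print(pairwise_correction(ONE_VS_ALL))
    # Codeword of class 1 with its first bit flipped by a faulty classifier.
    print(decode(EXAMPLE_CODE, [+1, -1, -1, -1, -1, +1]))
```

In this toy matrix the pair (0, 1) tolerates two errors while the other pairs tolerate one, which illustrates how correction can be distributed unevenly among classes. Under the same assumption, one-vs-all coding tolerates no errors for any pair.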