In many real-world pattern recognition scenarios, such as medical applications, the corresponding classification tasks can be imbalanced. In this study, we focus on binary imbalanced classification tasks, i.e., binary classification tasks in which one of the two classes (the minority class) is under-represented in comparison to the other (the majority class). Many approaches to counter class imbalance have been proposed in the literature, such as under- and oversampling. In this work, we introduce a novel method that addresses class imbalance. To this end, we first transform the binary classification task into an equivalent regression task. Subsequently, we generate a set of negative and positive target labels such that the corresponding regression task becomes balanced with respect to the redefined target label set. We evaluate our approach on a number of publicly available data sets in combination with Support Vector Machines. Moreover, we compare our proposed method to one of the most popular oversampling techniques (SMOTE). Based on a detailed discussion of the outcomes of our experimental evaluation, we propose promising directions for future research.
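To illustrate the idea of redefining target labels so that a regression task becomes balanced, a minimal sketch is given below. The abstract does not specify how the target values are constructed, so the scheme used here is purely an assumption for illustration and should not be read as the proposed method: each majority sample receives the target -1, and each minority sample receives n_majority / n_minority, so that all targets sum to zero. The function names `balanced_regression_targets` and `predict_class` are hypothetical, and the sign-based decision rule is likewise an assumption.

```python
from collections import Counter


def balanced_regression_targets(labels):
    """Map binary class labels to real-valued targets that sum to zero.

    Illustrative assumption (not taken from the abstract): every majority
    sample gets the target -1.0 and every minority sample gets
    +n_majority / n_minority.
    """
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    # most_common sorts by frequency, so the majority class comes first.
    (majority, n_maj), (minority, n_min) = counts.most_common(2)
    minority_target = n_maj / n_min
    return [minority_target if y == minority else -1.0 for y in labels]


def predict_class(regression_output, minority, majority):
    """Map a regressor's real-valued output back to a class by its sign."""
    return minority if regression_output > 0 else majority


# Toy data: 8 majority samples and 2 minority samples.
labels = ["healthy"] * 8 + ["ill"] * 2
targets = balanced_regression_targets(labels)
```

Under this assumed scheme, a regressor (for example a support vector regressor) would be trained on `targets` instead of the class labels, and its real-valued predictions would be mapped back to classes with `predict_class`.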