We propose a novel semiparametric classifier based on Mahalanobis distances of an observation from the competing classes. Our tool is a generalized additive model with the logistic link function that uses these distances as features to estimate the posterior probabilities of different classes. While popular parametric classifiers like linear and quadratic discriminant analyses are mainly motivated by the normality of the underlying distributions, the proposed classifier is more flexible and free from such parametric modeling assumptions. Since the densities of elliptic distributions are functions of Mahalanobis distances, this classifier works well when the competing classes are (nearly) elliptic. In such cases, it often outperforms popular nonparametric classifiers, especially when the sample size is small compared to the dimension of the data. To cope with non-elliptic and possibly multimodal distributions, we propose a local version of the Mahalanobis distance. Subsequently, we propose another classifier based on a generalized additive model that uses the local Mahalanobis distances as features. This nonparametric classifier usually performs like the Mahalanobis distance based semiparametric classifier when the underlying distributions are elliptic, but outperforms it for several non-elliptic and multimodal distributions. We also investigate the behaviour of these two classifiers in high dimension, low sample size situations. A thorough numerical study involving several simulated and real datasets demonstrate the usefulness of the proposed classifiers in comparison to many state-of-the-art methods.
翻译:暂无翻译