In this manuscript, we propose a multiclass data description model based on kernel Mahalanobis distance (MDD-KM) with self-adapting hyperparameter setting. MDD-KM provides uncertainty quantification and can be deployed to build classification systems for the realistic scenario in which out-of-distribution (OOD) samples are present among the test data. Given a test signal, a quantity related to the empirical kernel Mahalanobis distance between the signal and each of the training classes is computed. Since these quantities correspond to the same reproducing kernel Hilbert space, they are commensurable and can therefore be treated directly as classification scores without further application of fusion techniques. To set the kernel parameters, we exploit the fact that the predictive variance of a Gaussian process (GP) is the empirical kernel Mahalanobis distance when a centralized kernel is used, and propose to use the GP's negative likelihood function as the cost function. We conduct experiments on a real-world problem, avian note classification. We report a prototypical classification system based on a hierarchical linear dynamical system with MDD-KM as a component. Our classification system does not require sound event detection as a preprocessing step, and is able to find instances of the training avian notes, which vary in length, among OOD samples (corresponding to unknown notes that are not of interest) in the test audio clip. Domain knowledge is leveraged to make crisp decisions from the raw classification scores. We demonstrate the superior performance of MDD-KM over possibilistic K-nearest neighbor.
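The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an RBF kernel with a fixed, hand-picked length scale (the abstract instead tunes kernel parameters with the GP negative likelihood), a small noise term for numerical stability, and centering of the kernel about each class's training mean. The function names `rbf_kernel`, `class_variance_scores`, and `score_against_classes` are hypothetical. For each class, the GP predictive variance of a test point under the centered (centralized) kernel serves as a distance-like score: small values suggest the point resembles that class, and large values for every class suggest an OOD sample.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def class_variance_scores(X_train, X_test, length_scale=1.0, noise=1e-3):
    """GP predictive variance of each test point with respect to ONE class.

    The kernel is centered about the class's training mean in feature space.
    `length_scale` and `noise` are assumed values, not tuned ones.
    """
    n = X_train.shape[0]
    K = rbf_kernel(X_train, X_train, length_scale)   # shape (n, n)
    K_s = rbf_kernel(X_train, X_test, length_scale)  # shape (n, m)

    # Kernel means needed for centering.
    row_mean = K.mean(axis=1, keepdims=True)         # (n, 1)
    total_mean = K.mean()                            # scalar
    test_mean = K_s.mean(axis=0, keepdims=True)      # (1, m)

    # Centered train/train, train/test, and test/test kernel values.
    K_c = K - row_mean - row_mean.T + total_mean
    K_s_c = K_s - row_mean - test_mean + total_mean
    k_ss_c = 1.0 - 2.0 * test_mean.ravel() + total_mean  # RBF has k(x, x) = 1

    # Predictive variance: k(x*, x*) - k*^T (K + noise * I)^{-1} k*
    solved = np.linalg.solve(K_c + noise * np.eye(n), K_s_c)
    return k_ss_c - np.sum(K_s_c * solved, axis=0)


def score_against_classes(class_data, X_test, length_scale=1.0, noise=1e-3):
    """Stack per-class scores into an array of shape (n_classes, n_test)."""
    return np.vstack(
        [class_variance_scores(X, X_test, length_scale, noise) for X in class_data]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    class_a = rng.normal(loc=0.0, scale=0.5, size=(40, 2))  # synthetic class A
    class_b = rng.normal(loc=5.0, scale=0.5, size=(40, 2))  # synthetic class B
    # One query near A, one near B, one far from both (OOD-like).
    queries = np.array([[0.0, 0.0], [5.0, 5.0], [20.0, -20.0]])
    print(score_against_classes([class_a, class_b], queries))
```

Because every class is scored with the same kernel, the rows of the returned array are directly comparable, which mirrors the commensurability argument in the abstract. How raw scores are turned into crisp decisions is left out here, since the abstract attributes that step to domain knowledge.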