Despite recent advances in out-of-distribution (OOD) detection, anomaly detection, and uncertainty estimation, there does not exist a task-agnostic and post-hoc approach. To address this limitation, we design a novel clustering-based ensembling method, called Task Agnostic and Post-hoc Unseen Distribution Detection (TAPUDD), that utilizes the features extracted from a model trained on a specific task. Explicitly, it comprises TAP-Mahalanobis, which clusters the training dataset's features and determines the minimum Mahalanobis distance of a test sample from all clusters. Further, we propose an Ensembling module that aggregates the computation of iterative TAP-Mahalanobis runs for different numbers of clusters to provide reliable and efficient cluster computation. Through extensive experiments on synthetic and real-world datasets, we observe that our approach detects unseen samples effectively across diverse tasks and performs better than or on par with existing baselines. We thereby eliminate the need to determine the optimal number of clusters and demonstrate that our method is more viable for large-scale classification tasks.
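The two components described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain k-means for the clustering step, a separate covariance matrix per cluster, and a simple mean as the aggregation rule, none of which the abstract specifies. All function names (`kmeans`, `fit_clusters`, `min_mahalanobis`, `fit_ensemble`, `ensemble_score`) and the cluster counts are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns labels."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers leaves features untouched.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def fit_clusters(features, k, eps=1e-6):
    """Return a (mean, inverse covariance) pair for each usable cluster."""
    labels = kmeans(features, k)
    dim = features.shape[1]
    stats = []
    for j in range(k):
        members = features[labels == j]
        if len(members) < 2:
            continue  # too few points to estimate a covariance
        # eps * I keeps the covariance invertible (assumed regularization).
        cov = np.cov(members, rowvar=False) + eps * np.eye(dim)
        stats.append((members.mean(axis=0), np.linalg.inv(cov)))
    return stats


def min_mahalanobis(x, stats):
    """Minimum Mahalanobis distance from x to any cluster."""
    dists = []
    for mean, inv_cov in stats:
        diff = x - mean
        dists.append(np.sqrt(diff @ inv_cov @ diff))
    return min(dists)


def fit_ensemble(features, cluster_counts=(1, 2, 4)):
    """Fit one set of cluster statistics per candidate number of clusters."""
    features = np.asarray(features, dtype=float)
    return [fit_clusters(features, k) for k in cluster_counts]


def ensemble_score(x, ensemble):
    """Aggregate the per-k minimum distances; the mean is an assumption."""
    x = np.asarray(x, dtype=float)
    return float(np.mean([min_mahalanobis(x, stats) for stats in ensemble]))


if __name__ == "__main__":
    # Synthetic stand-in for features extracted from a trained model.
    rng = np.random.default_rng(0)
    train = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
                       rng.normal(8.0, 1.0, (200, 2))])
    ensemble = fit_ensemble(train)
    print("near a training cluster:", ensemble_score([0.2, -0.1], ensemble))
    print("far from training data: ", ensemble_score([30.0, -30.0], ensemble))
```

A higher score suggests a sample lies farther from every training cluster. Because the scores for several cluster counts are combined, no single value of the number of clusters has to be chosen in advance, which is the motivation the abstract gives for the Ensembling module.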