A multi-label classifier estimates the binary state (relevant or irrelevant) of each label in a set of concept labels, for any given instance. Probabilistic multi-label classifiers provide a predictive posterior distribution over all possible labelsets, that is, all combinations of such label states (the powerset of labels), from which we can provide the best estimate simply by selecting the labelset with the largest expected accuracy under that distribution. For example, to maximize exact match accuracy, we provide the mode of the distribution. But how does this relate to the confidence we may have in such an estimate? Confidence is an important element of real-world applications of multi-label classifiers (as in machine learning in general) and an important ingredient in explainability and interpretability. However, it is not obvious how to provide confidence in the multi-label context with respect to a particular accuracy metric, nor is it clear how to provide a confidence that correlates well with the expected accuracy, which would be most valuable in real-world decision making. In this article we estimate the expected accuracy as a surrogate for confidence, for a given accuracy metric. We hypothesise that the expected accuracy can be estimated from the multi-label predictive distribution. We examine seven candidate functions for their ability to estimate expected accuracy from the predictive distribution. We found that three of these correlate with expected accuracy and are robust. Further, we determined that each candidate function can be used separately to estimate Hamming similarity, but that a combination of the candidates was best for expected Jaccard index and exact match.
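To make the decision rule described above concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy predictive distribution over two labels (the probabilities are invented for illustration) and hypothetical helper names such as `expected_accuracy` and `best_estimate`. It enumerates every candidate labelset and selects the one with the largest expected accuracy under the chosen metric. The seven candidate confidence functions studied in the article are not specified in this abstract and are not reproduced here.

```python
from itertools import product


def exact_match(y, z):
    """1 if the two labelsets are identical, else 0."""
    return 1.0 if y == z else 0.0


def hamming_similarity(y, z):
    """Fraction of label positions on which the two labelsets agree."""
    return sum(a == b for a, b in zip(y, z)) / len(y)


def jaccard_index(y, z):
    """Intersection over union of relevant labels (1 if both are empty)."""
    union = sum(a or b for a, b in zip(y, z))
    inter = sum(a and b for a, b in zip(y, z))
    return 1.0 if union == 0 else inter / union


def expected_accuracy(candidate, dist, metric):
    """Expectation of metric(candidate, Y) with Y drawn from dist."""
    return sum(p * metric(candidate, y) for y, p in dist.items())


def best_estimate(dist, metric):
    """Return (labelset, expected accuracy) maximizing expected accuracy.

    Brute force over all 2^L labelsets, so only feasible for small L.
    """
    n_labels = len(next(iter(dist)))
    candidates = product((0, 1), repeat=n_labels)
    score, labelset = max(
        (expected_accuracy(c, dist, metric), c) for c in candidates
    )
    return labelset, score


# Assumed toy posterior over labelsets of two labels (illustrative only).
dist = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3, (0, 1): 0.0}

for metric in (exact_match, hamming_similarity, jaccard_index):
    labelset, score = best_estimate(dist, metric)
    print(metric.__name__, labelset, round(score, 3))
```

On this toy distribution the exact match optimum is the mode, `(0, 0)`, while the Hamming similarity optimum is `(1, 0)`, which follows the per-label marginals. The maximized expected accuracy returned alongside each labelset is the quantity that the article proposes to estimate as a surrogate for confidence.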