In this paper, we consider the problem of making distributionally robust, skeptical inferences for the multi-label problem, or more generally for Boolean vectors. By distributionally robust, we mean that we consider a set of possible probability distributions, and by skeptical, we mean that we accept as valid only those inferences that hold for every distribution within this set. Such inferences yield partial predictions whenever the considered set is sufficiently large. We study in particular the case of the Hamming loss, a common loss function in multi-label problems, and show how skeptical inferences can be made in this setting. Our experimental results are organised in three parts: (1) the first uses synthetic data sets to quantify the computational gain obtained from our theoretical results, (2) the second indicates that our approach is appropriately cautious on those hard-to-predict instances where its precise counterpart fails, and (3) the last demonstrates experimentally that our approach copes with imperfect information (generated by a downsampling procedure) better than partial abstention [31] and rejection rules.
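As an informal illustration of what a skeptical inference under the Hamming loss can look like, the sketch below makes two simplifying assumptions that are not stated in the abstract: the set of distributions is finite, and each distribution is summarised by its label-wise marginals P(Y_j = 1). For a single distribution, the Hamming-optimal prediction sets label j to 1 exactly when P(Y_j = 1) > 0.5, so a label is predicted here only when every distribution in the set agrees on it; otherwise the prediction abstains on that label. The function name and the data are hypothetical.

```python
from typing import List, Optional, Sequence


def skeptical_hamming_prediction(
    marginal_sets: Sequence[Sequence[float]],
) -> List[Optional[int]]:
    """Label-wise skeptical prediction (illustrative sketch only).

    marginal_sets[k][j] is P_k(Y_j = 1) under the k-th distribution of an
    assumed finite set. Returns 1 or 0 when all distributions agree on the
    Hamming-optimal value of a label, and None (abstention) otherwise.
    """
    n_labels = len(marginal_sets[0])
    prediction: List[Optional[int]] = []
    for j in range(n_labels):
        values = [marginals[j] for marginals in marginal_sets]
        lower, upper = min(values), max(values)
        if lower > 0.5:
            prediction.append(1)     # every distribution favours 1
        elif upper < 0.5:
            prediction.append(0)     # every distribution favours 0
        else:
            prediction.append(None)  # distributions disagree: abstain
    return prediction


# Hypothetical set of two distributions over three labels.
example_set = [
    [0.9, 0.2, 0.6],
    [0.8, 0.1, 0.4],
]
result = skeptical_hamming_prediction(example_set)
```

In this example the two distributions agree on the first two labels and disagree on the third, so the prediction is partial; a larger set of distributions can only produce more abstentions.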