Partial label learning (PLL) is a typical weakly supervised learning problem in which each training example is associated with a set of candidate labels, only one of which is true. Most existing PLL approaches assume that the incorrect labels in each candidate set are picked at random, and therefore model the generation process of the candidate labels in a simple way. However, these approaches usually do not perform as well as expected, because the generation process of the candidate labels is always instance-dependent and deserves to be modeled in a more refined way. In this paper, we consider instance-dependent PLL and assume that the generation process of the candidate labels can be decomposed into two sequential parts: the correct label first emerges in the mind of the annotator, and then, owing to labeling uncertainty, incorrect labels related to the instance's features are selected alongside the correct label as candidate labels. Motivated by this consideration, we propose a novel PLL method that performs maximum a posteriori (MAP) estimation based on an explicitly modeled generation process of candidate labels via decomposed probability distribution models. Extensive experiments on manually corrupted benchmark datasets and real-world datasets validate the effectiveness of the proposed method. Source code is available at https://github.com/palm-ml/idgp.
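The two-step generation process described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method of the paper: the function name `sample_candidate_set`, the per-label probabilities `flip_probs`, and the choice of independent Bernoulli draws for the incorrect labels are all hypothetical, introduced here for illustration.

```python
import random


def sample_candidate_set(true_label, flip_probs, rng):
    """Simulate a two-step candidate-label generation for one instance.

    Step 1: the true label always enters the candidate set.
    Step 2: each incorrect label j is added independently with probability
            flip_probs[j], which is assumed to depend on the instance.
    The independent-Bernoulli choice is an illustrative assumption.
    """
    candidates = {true_label}
    for label, prob in enumerate(flip_probs):
        if label != true_label and rng.random() < prob:
            candidates.add(label)
    return candidates


# Hypothetical instance with 4 classes whose true label is 0.
# Labels that "look similar" to this instance get a higher probability.
rng = random.Random(0)
flip_probs = [0.0, 0.7, 0.1, 0.4]
candidates = sample_candidate_set(0, flip_probs, rng)
print(sorted(candidates))
```

In an instance-dependent setting, `flip_probs` would differ from one instance to the next. The uniform-random assumption criticized in the abstract corresponds to using the same probability for every incorrect label of every instance.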