This paper studies learning with augmented classes (LAC), where classes unobserved in the training data may emerge in the testing phase. Previous studies generally attempt to discover augmented classes by exploiting geometric properties, achieving encouraging empirical performance yet lacking theoretical understanding, particularly of generalization ability. In this paper we show that, by using unlabeled training data to approximate the potential distribution of augmented classes, an unbiased risk estimator over the testing distribution can be established for the LAC problem under mild assumptions, which paves the way for a sound approach with theoretical guarantees. Moreover, the proposed approach can adapt to complex changing environments in which augmented classes may appear and the prior of known classes may change simultaneously. Extensive experiments confirm the effectiveness of the proposed approach.
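To make the idea of an unbiased risk estimator concrete, the following is a minimal sketch of how such an estimator could arise. It is an illustration under stated assumptions, not necessarily the exact formulation used in the paper. All symbols are assumptions introduced here: the testing distribution $P_{te}$ is taken to be a mixture of the known-class distribution $P_{kc}$ and the augmented-class distribution $P_{ac}$ with a known mixture proportion $\theta$, the unlabeled data are taken to be drawn from $P_{te}$, $\ell$ is a loss function, and $\mathsf{ac}$ denotes the augmented-class label.

```latex
% Assumed mixture condition on the testing distribution:
P_{te} = \theta \, P_{kc} + (1 - \theta) \, P_{ac}, \qquad \theta \in (0, 1].

% Risk over the testing distribution, treating all augmented classes as one label "ac":
R(f) = \theta \, \mathbb{E}_{(x,y) \sim P_{kc}} \big[ \ell(f(x), y) \big]
     + (1 - \theta) \, \mathbb{E}_{x \sim P_{ac}} \big[ \ell(f(x), \mathsf{ac}) \big].

% Substituting (1 - \theta) P_{ac} = P_{te} - \theta P_{kc} removes the unobserved P_{ac}:
R(f) = \theta \, \mathbb{E}_{(x,y) \sim P_{kc}} \big[ \ell(f(x), y) - \ell(f(x), \mathsf{ac}) \big]
     + \mathbb{E}_{x \sim P_{te}} \big[ \ell(f(x), \mathsf{ac}) \big].

% Empirical counterpart with n_l labeled samples (x_i, y_i) and n_u unlabeled samples x'_j:
\widehat{R}(f) = \frac{\theta}{n_l} \sum_{i=1}^{n_l}
       \big[ \ell(f(x_i), y_i) - \ell(f(x_i), \mathsf{ac}) \big]
     + \frac{1}{n_u} \sum_{j=1}^{n_u} \ell(f(x'_j), \mathsf{ac}).
```

Under these assumptions, $\mathbb{E}[\widehat{R}(f)] = R(f)$ follows from linearity of expectation, so the augmented-class term can be evaluated without ever observing a labeled augmented-class example. In practice $\theta$ would have to be estimated, which this sketch does not cover.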