The centrality and diversity of the labeled data strongly influence the performance of semi-supervised learning (SSL), yet most SSL models select the labeled data at random. How to guarantee the centrality and diversity of the labeled data has so far received little research attention. The optimal leading forest (OLF) has been observed to reveal the difference evolution within a class when used to develop an SSL model. The key intuition of this study is to learn a kernelized large-margin metric for a small amount of the most stable and most divergent data identified from the OLF structure. An optimization problem is formulated to achieve this goal. OLF also facilitates learning multiple local metrics to address the multi-modal and mixed-modal problems in SSL. Owing to this novel design, the accuracy and stability of the OLF-based SSL model are significantly improved over its baseline methods without sacrificing much efficiency. Experimental studies show that the proposed method achieves encouraging accuracy and running time compared with state-of-the-art graph-based SSL methods. Code is available at https://github.com/alanxuji/DeLaLA.
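To make the selection idea concrete, below is a minimal, illustrative Python sketch, not the authors' DeLaLA implementation: it builds a single density-peaks style leading tree (rather than a full optimal leading forest) and picks labeling candidates that are either central (high local density, i.e. stable) or divergent (deep in the tree, far from the dense core). All function names (`leading_structure`, `select_labeled`) and parameters are hypothetical and chosen only for this example.

```python
import numpy as np
from scipy.spatial.distance import cdist

def leading_structure(X, dc=None):
    """Density-peaks style leading tree: each point is led by its
    nearest neighbour of higher local density (a simplification of OLF)."""
    D = cdist(X, X)
    if dc is None:
        dc = np.percentile(D[D > 0], 2)           # cutoff-distance heuristic
    rho = np.exp(-(D / dc) ** 2).sum(axis=1)      # Gaussian local density (centrality)
    order = np.argsort(-rho)                      # densest first
    parent = -np.ones(len(X), dtype=int)          # -1 marks the root
    for i, p in enumerate(order[1:], start=1):
        higher = order[:i]                        # all denser points
        parent[p] = higher[np.argmin(D[p, higher])]
    return rho, parent

def depth_from_root(parent):
    """Layer index of each node in the leading tree (a divergence proxy)."""
    depth = np.zeros(len(parent), dtype=int)
    for i in range(len(parent)):
        d, j = 0, i
        while parent[j] != -1:
            j = parent[j]
            d += 1
        depth[i] = d
    return depth

def select_labeled(X, n_central, n_divergent):
    """Pick labeling candidates: the most central plus the most divergent points."""
    rho, parent = leading_structure(X)
    depth = depth_from_root(parent)
    central = np.argsort(-rho)[:n_central]                     # stable, high-density points
    rest = np.setdiff1d(np.arange(len(X)), central)
    divergent = rest[np.argsort(-depth[rest])][:n_divergent]   # deep, marginal points
    return np.concatenate([central, divergent])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
    print("indices to label:", select_labeled(X, n_central=4, n_divergent=4))
```

The sketch only illustrates the centrality/divergence selection criterion; the paper's method additionally learns a kernelized large-margin metric (and multiple local metrics) on the selected points, which is omitted here.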