Pairwise objective paradigms are an essential aspect of machine learning. Examples of machine learning approaches that use pairwise objective functions include differential networks in face recognition, metric learning, bipartite learning, multiple kernel learning, and maximization of the area under the curve (AUC). Compared to pointwise learning, the number of training pairs in pairwise learning grows quadratically with the number of samples, and so does the computational complexity. Researchers have mostly addressed this challenge with online learning schemes. Recent work, however, has proposed adaptive sample size training for smooth loss functions as a better strategy in terms of convergence and complexity, although without a comprehensive theoretical study. In a distinct line of research, importance sampling has attracted considerable interest in finite-sum pointwise minimization, because the variance of the stochastic gradient can slow convergence considerably. In this paper, we combine adaptive sample size and importance sampling techniques for pairwise learning, with convergence guarantees for nonsmooth convex pairwise loss functions. In particular, the model is trained stochastically on an expanding training set for a predefined number of iterations derived from stability bounds. In addition, we demonstrate that sampling opposite instances at each iteration reduces the variance of the gradient, hence accelerating convergence. Experiments on a broad variety of datasets for AUC maximization confirm the theoretical results.
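The sketch below illustrates the kind of training scheme the abstract describes, under stated assumptions: a linear scorer with the nonsmooth pairwise hinge loss, a training set grown geometrically stage by stage, a per-stage iteration budget standing in for the stability-derived count, and one positive plus one negative instance ("opposite instances") sampled at each step. The names (`adaptive_sample_size_auc`, `pairwise_hinge_grad`), the doubling schedule, the step size, and the budget are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def pairwise_hinge_grad(w, x_pos, x_neg):
    # Subgradient of the nonsmooth pairwise hinge loss
    # max(0, 1 - w^T (x_pos - x_neg)) with respect to w.
    d = x_pos - x_neg
    return -d if 1.0 - w @ d > 0.0 else np.zeros_like(w)

def adaptive_sample_size_auc(X, y, eta=0.01, iters_per_sample=2, seed=0):
    """Grow the training set geometrically; at each stage run SGD on
    pairs of opposite-class instances drawn from the current subset.
    The per-stage budget (iters_per_sample * subset size) stands in
    for the stability-derived iteration count in the paper."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    w = np.zeros(dim)
    m = max(2, n // 8)                    # initial subset size (illustrative)
    while True:
        pos = np.flatnonzero(y[:m] == 1)
        neg = np.flatnonzero(y[:m] == -1)
        for _ in range(iters_per_sample * m):
            # "Opposite instances": one positive and one negative, so
            # every sampled pair yields an informative comparison.
            i, j = rng.choice(pos), rng.choice(neg)
            w -= eta * pairwise_hinge_grad(w, X[i], X[j])
        if m == n:
            return w
        m = min(2 * m, n)                 # double the sample size each stage

# Toy usage: two shuffled Gaussian classes; report the empirical AUC
# of the learned linear scorer.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 1.0, (200, 5)), rng.normal(-1.0, 1.0, (200, 5))])
y = np.concatenate([np.ones(200, dtype=int), -np.ones(200, dtype=int)])
perm = rng.permutation(len(y))            # shuffle so every prefix has both classes
X, y = X[perm], y[perm]
w = adaptive_sample_size_auc(X, y)
scores = X @ w
auc = np.mean(scores[y == 1][:, None] > scores[y == -1][None, :])
print(f"empirical AUC: {auc:.3f}")
```

Restricting each sampled pair to opposite classes mirrors the variance-reduction idea in the abstract: same-class pairs contribute nothing to the AUC objective, so excluding them avoids zero-gradient draws.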