Pairwise learning refers to learning tasks where the loss function depends on a pair of instances. It instantiates many important machine learning tasks such as bipartite ranking and metric learning. A popular approach to handle streaming data in pairwise learning is an online gradient descent (OGD) algorithm, where one needs to pair the current instance with a buffering set of previous instances of sufficiently large size, and therefore suffers from a scalability issue. In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning. A notable difference from existing studies is that we only pair the current instance with the previous one in building a gradient direction, which is efficient in both storage and computational complexity. We develop novel stability results, optimization, and generalization error bounds for both convex and nonconvex as well as both smooth and nonsmooth problems. We introduce novel techniques to decouple the dependency between the models and the previous instance in both the optimization and generalization analysis. Our study resolves an open question on developing meaningful generalization bounds for OGD using a buffering set with a very small fixed size. We also extend our algorithms and stability analysis to develop differentially private SGD algorithms for pairwise learning, which significantly improve the existing results.
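To make the core idea concrete, the sketch below illustrates an online update that pairs each incoming instance only with the immediately preceding one, so the buffer holds a single example. This is a minimal illustration, not the paper's exact algorithm: the pairwise hinge loss for bipartite ranking, the function name `simple_sgd_pairwise`, and the step size `eta` are illustrative choices.

```python
import numpy as np

def simple_sgd_pairwise(stream, d, eta=0.1):
    """Illustrative sketch: SGD for pairwise learning that pairs each
    incoming instance only with the previous one (buffer of size one).

    `stream` yields (x, y) pairs with x a feature vector of dimension d.
    The pairwise hinge loss for bipartite ranking used here is one
    possible instantiation, not the paper's specific setup.
    """
    w = np.zeros(d)
    prev = None
    for x, y in stream:
        if prev is not None:
            x_p, y_p = prev
            if y != y_p:                    # only pairs with distinct labels contribute to ranking
                sign = 1.0 if y > y_p else -1.0
                margin = sign * w.dot(x - x_p)
                if margin < 1.0:            # subgradient step on the hinge loss max(0, 1 - margin)
                    w += eta * sign * (x - x_p)
        prev = (x, y)                       # keep only the most recent instance
    return w
```

Because only the last instance is stored and a single pair is formed per step, both the memory footprint and the per-iteration cost are constant in the stream length, in contrast to OGD variants that pair the current instance with a large buffering set.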