We study the problem of learning similarity with nonlinear embedding models (e.g., neural networks) from all possible pairs. This problem is well known to be difficult to train because of the extreme number of pairs. For the special case of linear embeddings, many studies have addressed the issue of handling all pairs by considering certain loss functions and developing efficient optimization algorithms. This paper aims to extend those results to general nonlinear embeddings. First, we complete detailed derivations and provide clean formulations for efficiently computing key building blocks of optimization algorithms: function evaluation, gradient evaluation, and the Hessian-vector product. These results enable many optimization methods to be applied to extreme similarity learning with nonlinear embeddings. Second, we study some optimization methods in detail. Because nonlinear embeddings are used, we address implementation issues that differ from the linear case. Finally, some methods are shown to be highly efficient for extreme similarity learning with nonlinear embeddings.
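As a minimal sketch of one building block mentioned above, the snippet below illustrates a Hessian-vector product for a toy nonlinear (sigmoid) embedding loss, approximated by central finite differences of the gradient, a standard device in truncated-Newton methods. The loss, data, and variable names here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # toy squared loss over a nonlinear (sigmoid) embedding of the data
    return 0.5 * np.sum((sigmoid(X @ w) - y) ** 2)

def grad(w, X, y):
    # analytic gradient of the loss above
    p = sigmoid(X @ w)
    return X.T @ ((p - y) * p * (1.0 - p))

def hvp(w, v, X, y, eps=1e-5):
    # Hessian-vector product without forming the Hessian:
    # H(w) v ~= (grad(w + eps*v) - grad(w - eps*v)) / (2*eps)
    return (grad(w + eps * v, X, y) - grad(w - eps * v, X, y)) / (2.0 * eps)
```

Because the Hessian is never materialized, the cost per product stays linear in the number of parameters, which is what makes Newton-type methods viable at this scale.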