Pairwise learning is the supervised learning setting in which the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target interactions, protein-protein interactions, and customer-product preferences. In this work, we present a comprehensive review of pairwise kernels that have been proposed for incorporating prior knowledge about the relationship between the objects. Specifically, we consider the standard, symmetric and anti-symmetric Kronecker product kernels, as well as the metric-learning, Cartesian, ranking, linear, polynomial and Gaussian kernels. Recently, an O(nm + nq) time generalized vec trick algorithm, where n, m and q denote the number of pairs, drugs and targets, respectively, was introduced for training kernel methods with the Kronecker product kernel. This was a significant improvement over previous O(n^2) training methods, since in most real-world applications m, q << n. In this work, we show how all the reviewed kernels can be expressed as sums of Kronecker products, which allows the generalized vec trick to be used to speed up their computation. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and we provide an extensive comparison of the kernels on a number of biological interaction prediction tasks.
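To illustrate where the O(nm + nq) cost comes from, the following is a minimal NumPy sketch of a kernel matrix-vector product for the Kronecker product kernel. It is a simplified illustration under stated assumptions, not the published implementation: the function name `kron_matvec` and the arguments `D` (an m x m drug kernel matrix), `T` (a q x q target kernel matrix), `d_idx` and `t_idx` (the drug and target index of each of the n pairs) and `a` (a coefficient vector) are hypothetical names chosen for this example.

```python
import numpy as np


def kron_matvec(D, T, d_idx, t_idx, a):
    """Return K @ a, where K[i, j] = D[d_idx[i], d_idx[j]] * T[t_idx[i], t_idx[j]],
    without ever forming the n x n pairwise kernel matrix K.

    All names here are illustrative assumptions, not an established API.
    """
    m = D.shape[0]
    q = T.shape[0]
    # Step 1, O(nm) operations:
    # Wt[t, k] = sum over pairs j with t_idx[j] == t of D[k, d_idx[j]] * a[j]
    Wt = np.zeros((q, m))
    np.add.at(Wt, t_idx, (D[:, d_idx] * a).T)
    # Step 2, O(nq) operations:
    # p[i] = sum over targets t of Wt[t, d_idx[i]] * T[t_idx[i], t]
    return np.einsum("ti,it->i", Wt[:, d_idx], T[t_idx, :])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, q, n = 5, 4, 12
    D = rng.standard_normal((m, m))
    T = rng.standard_normal((q, q))
    d_idx = rng.integers(0, m, size=n)
    t_idx = rng.integers(0, q, size=n)
    a = rng.standard_normal(n)

    # Naive O(n^2) reference: build the pairwise kernel matrix explicitly.
    K = D[np.ix_(d_idx, d_idx)] * T[np.ix_(t_idx, t_idx)]
    print(np.allclose(K @ a, kron_matvec(D, T, d_idx, t_idx, a)))
```

Because a kernel written as a sum of Kronecker products yields a matrix-vector product that is the sum of the per-term products, a routine of this kind could in principle be called once per term and the results added, which is the property an iterative kernel method solver would rely on.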