The accuracy and complexity of machine learning algorithms based on kernel optimization are determined by the set of kernels over which they are able to optimize. An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for robustness); and be universal (for accuracy). Recently, a framework was proposed for using positive matrices to parameterize a class of positive semi-separable kernels. Although this class can be shown to meet all three criteria, previous algorithms for optimizing such kernels were limited to classification and, furthermore, relied on computationally intensive Semidefinite Programming (SDP) solvers. In this paper, we pose the problem of learning semi-separable kernels as a minimax optimization problem and propose an SVD-QCQP primal-dual algorithm that dramatically reduces computational complexity compared with previous SDP-based approaches. Furthermore, we provide an efficient implementation of this algorithm for both classification and regression -- an implementation which enables us to solve problems with 100 features and up to 30,000 data points. Finally, when applied to benchmark data, the algorithm demonstrates the potential for significant improvements in accuracy over typical (but non-convex) approaches such as Neural Networks and Random Forests, with similar or better computation time.
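As a minimal sketch of this formulation (assuming a standard soft-margin SVM dual as the inner problem; the paper's exact constraint set may differ), the kernel $k_P$ is parameterized linearly by a positive matrix $P$, the outer minimization is over $P$, and the inner maximization is the SVM dual:

$$\min_{P \succeq 0,\; \operatorname{tr}(P) \le q} \;\; \max_{\alpha \in \mathcal{A}} \;\; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, k_P(x_i, x_j),$$

where $\mathcal{A} = \{\alpha : 0 \le \alpha_i \le C,\; \sum_i \alpha_i y_i = 0\}$ is the standard dual feasible set and the trace bound $q$ is a hypothetical normalization on $P$ introduced here only for illustration.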