The vanishing ideal of a set of points $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_m\} \subseteq \mathbb{R}^n$ is the set of polynomials that evaluate to $0$ on every point $\mathbf{x} \in X$, and it admits an efficient representation by a finite set of generators. In practice, to accommodate noise in the data, algorithms that construct generators of the approximate vanishing ideal are widely studied, but their computational cost remains high. In this paper, we scale up the oracle approximate vanishing ideal algorithm (OAVI), the only generator-constructing algorithm with known learning guarantees. We prove that the computational complexity of OAVI is not superlinear, as previously claimed, but linear in the number of samples $m$. In addition, we propose two modifications that accelerate OAVI's training time: our analysis reveals that replacing the pairwise conditional gradients algorithm, one of the solvers used in OAVI, with the faster blended pairwise conditional gradients algorithm leads to an exponential speed-up in the number of features $n$. Finally, using a new inverse Hessian boosting approach, intermediate convex optimization problems can be solved almost instantly, improving OAVI's training time by multiple orders of magnitude in a variety of numerical experiments.
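The following is a minimal sketch of the (approximate) vanishing ideal concept itself, not of OAVI or any generator-constructing algorithm: points are sampled from the unit circle, on which the polynomial $g(x, y) = x^2 + y^2 - 1$ vanishes exactly, and additive noise is then used to illustrate why only *approximate* vanishing can be expected in practice. The sampling setup and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Points X = {x_1, ..., x_m} sampled from the unit circle in R^2.
angles = rng.uniform(0.0, 2.0 * np.pi, size=50)
X = np.column_stack([np.cos(angles), np.sin(angles)])


def g(points):
    """Candidate generator g(x, y) = x^2 + y^2 - 1 of the circle's vanishing ideal."""
    return points[:, 0] ** 2 + points[:, 1] ** 2 - 1.0


# On noiseless data, g evaluates to (numerically) zero on every point,
# so g lies in the vanishing ideal of X.
assert np.max(np.abs(g(X))) < 1e-12

# With additive noise, g no longer vanishes exactly, only approximately;
# this motivates constructing generators of the approximate vanishing ideal.
X_noisy = X + rng.normal(scale=1e-3, size=X.shape)
print(np.max(np.abs(g(X_noisy))))  # small but nonzero
```

On the noisy points, $|g|$ is small but nonzero, which is precisely the regime that approximate vanishing ideal algorithms are designed to handle.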