The vanishing ideal of a set of points $X\subseteq \mathbb{R}^n$ is the set of polynomials that evaluate to $0$ over all points $\mathbf{x} \in X$ and admits an efficient representation by a finite set of polynomials called generators. To accommodate noise in the data set, we introduce the Conditional Gradients Approximately Vanishing Ideal algorithm (CGAVI) for constructing a set of generators of the approximately vanishing ideal. The constructed set of generators captures polynomial structure in the data and gives rise to a feature map that can, for example, be combined with a linear classifier for supervised learning. In CGAVI, we construct the set of generators by solving specific instances of (constrained) convex optimization problems with the Pairwise Frank-Wolfe algorithm (PFW). Among other things, the constructed generators inherit the LASSO generalization bound and vanish not only on the training data but also on out-of-sample data. Moreover, CGAVI admits a compact representation of the approximately vanishing ideal by constructing few generators with sparse coefficient vectors.
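For concreteness, a minimal formalization of the objects named above (the tolerance $\psi$ and the generator labels $g_1, \dots, g_k$ are notational assumptions introduced here for illustration, not taken from the text): the vanishing ideal of $X$ is $\mathcal{I}(X) = \{ f \in \mathbb{R}[x_1, \dots, x_n] : f(\mathbf{x}) = 0 \text{ for all } \mathbf{x} \in X \}$; a polynomial $f$ may be said to approximately vanish over $X$ when, for instance, $\frac{1}{|X|} \sum_{\mathbf{x} \in X} f(\mathbf{x})^2 \le \psi$ for some tolerance $\psi \ge 0$; and a constructed set of generators $g_1, \dots, g_k$ induces the feature map $\mathbf{x} \mapsto \big( g_1(\mathbf{x}), \dots, g_k(\mathbf{x}) \big) \in \mathbb{R}^k$, whose output can then be passed to a linear classifier.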