Machine learning models are widely used in real-world applications such as document analysis and computer vision. Constrained machine learning problems are problems in which the learned model must be both accurate and respect given constraints. For continuous convex constraints, many approaches have been proposed, but learning under combinatorial constraints remains a hard problem. The goal of this paper is to broaden the modeling capacity of constrained machine learning by incorporating existing work from combinatorial optimization. We first propose a general framework called BaGeL (Branch, Generate and Learn), which applies Branch and Bound to constrained learning problems: a learning problem is generated and trained at each node until only valid models are obtained. Because machine learning has specific requirements, we also propose an extended table constraint to split the hypothesis space. We validate the approach on two examples: a linear regression under configuration constraints, and a non-negative matrix factorization with prior knowledge for latent semantic analysis.
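The branch-generate-learn loop described above can be sketched as follows. This is a minimal illustrative skeleton, not the authors' implementation: the callbacks `split`, `train`, `is_valid`, and `bound` are hypothetical names standing in for the node-splitting rule (e.g. the extended table constraint), the generated learning subproblem, the constraint check on the trained model, and the node's lower bound, respectively.

```python
import heapq

def bagel(root_constraints, split, train, is_valid, bound):
    """Hypothetical sketch of a Branch, Generate and Learn loop:
    explore constraint subproblems best-first, train a model on the
    learning problem generated at each node, and keep only models
    that satisfy every combinatorial constraint."""
    best_model, best_loss = None, float("inf")
    # Frontier entries: (lower bound, insertion id for tie-breaking, node).
    frontier = [(0.0, 0, root_constraints)]
    counter = 1
    while frontier:
        lb, _, constraints = heapq.heappop(frontier)
        if lb >= best_loss:
            continue  # prune: this node cannot beat the incumbent
        model, loss = train(constraints)  # generate and solve the learning problem
        if is_valid(model, constraints):
            if loss < best_loss:
                best_model, best_loss = model, loss
        else:
            # Branch: split the node's hypothesis space into subproblems.
            for child in split(constraints):
                heapq.heappush(frontier, (bound(child), counter, child))
                counter += 1
    return best_model, best_loss
```

As a toy usage, one can instantiate the callbacks with an interval-splitting rule and an integrality constraint on a single learned weight; the loop then behaves like classical Branch and Bound with a relaxed learning problem solved at each node.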