Optimal binning is the optimal discretization of a variable into bins given a discrete or continuous numeric target. We present a rigorous and extensible mathematical programming formulation for solving the optimal binning problem for binary, continuous, and multi-class target types, incorporating constraints not previously addressed. For all three target types, we introduce a convex mixed-integer programming formulation. We discuss several algorithmic enhancements, such as the automatic determination of the most suitable monotonic trend via a machine-learning-based classifier, as well as implementation aspects. The new mathematical programming formulations are implemented in the open-source Python library OptBinning.
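To make the problem statement concrete, the following is a minimal sketch of optimal binning for a binary target on a toy example. It is not the convex mixed-integer formulation described above and does not use the OptBinning API: it simply enumerates every way of merging consecutive pre-bins and keeps the feasible merge with the best score. The choice of Information Value as the objective, the monotonic event-rate constraint, and all names (`brute_force_binning`, `information_value`, `merge`, `is_monotonic`, `max_bins`, `min_bin_size`) are assumptions made for illustration.

```python
from itertools import combinations
from math import log


def information_value(bins):
    """Information Value of a binning given (non_events, events) counts per bin."""
    total_non_events = sum(g for g, _ in bins)
    total_events = sum(b for _, b in bins)
    iv = 0.0
    for g, b in bins:
        p = g / total_non_events  # share of non-events falling in this bin
        q = b / total_events      # share of events falling in this bin
        iv += (p - q) * log(p / q)
    return iv


def merge(prebins, cuts):
    """Merge consecutive pre-bins; `cuts` are the indices where a new bin starts."""
    edges = [0, *cuts, len(prebins)]
    merged = []
    for lo, hi in zip(edges, edges[1:]):
        g = sum(x[0] for x in prebins[lo:hi])
        b = sum(x[1] for x in prebins[lo:hi])
        merged.append((g, b))
    return merged


def is_monotonic(rates):
    """True if the sequence is non-decreasing or non-increasing."""
    pairs = list(zip(rates, rates[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def brute_force_binning(prebins, max_bins=3, min_bin_size=1):
    """Exhaustively search for the merge of pre-bins with the highest IV.

    Feasible solutions have at most `max_bins` bins, at least `min_bin_size`
    records per bin, both classes present in every bin, and a monotonic
    event rate. The search is exponential, so this is for tiny inputs only.
    """
    best_iv, best_cuts = -1.0, None
    n = len(prebins)
    for k in range(0, max_bins):  # k cut points produce k + 1 bins
        for cuts in combinations(range(1, n), k):
            bins = merge(prebins, cuts)
            if any(g == 0 or b == 0 or g + b < min_bin_size for g, b in bins):
                continue
            rates = [b / (g + b) for g, b in bins]
            if not is_monotonic(rates):
                continue
            iv = information_value(bins)
            if iv > best_iv:
                best_iv, best_cuts = iv, cuts
    return best_cuts, best_iv


# Toy pre-bins ordered by the variable's value: (non_events, events) per pre-bin.
# The event rate dips at the third pre-bin, so the unmerged pre-bins are not monotonic.
prebins = [(90, 10), (80, 20), (85, 15), (60, 40), (50, 50)]
cuts, iv = brute_force_binning(prebins, max_bins=3)
print("cut indices:", cuts, "IV:", round(iv, 4))
```

Exhaustive enumeration is only practical for a handful of pre-bins, since the number of candidate merges grows exponentially. That growth is the motivation for posing the same search as a mathematical program, where constraints such as monotonicity and bin-size limits can be stated declaratively and handed to a solver.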