In this paper, we explore a model-based approach to training robust and interpretable binarized regression models for multiclass classification tasks using Mixed-Integer Programming (MIP). Our MIP model balances the optimization of prediction margin and model size by using a weighted objective that: minimizes the total margin of incorrectly classified training instances, maximizes the total margin of correctly classified training instances, and maximizes the overall model regularization. We conduct two sets of experiments to test the classification accuracy of our MIP model over standard and corrupted versions of multiple classification datasets, respectively. In the first set of experiments, we show that our MIP model outperforms an equivalent Pseudo-Boolean Optimization (PBO) model and achieves results competitive with Logistic Regression (LR) and Gradient Descent (GD) in terms of classification accuracy over the standard datasets. In the second set of experiments, we show that our MIP model outperforms the other models (i.e., GD and LR) in terms of classification accuracy over the majority of the corrupted datasets. Finally, we visually demonstrate the interpretability of our MIP model in terms of its learned parameters on the MNIST dataset. Overall, we show the effectiveness of training robust and interpretable binarized regression models using MIP.
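As a rough illustration of the weighted objective described above (not the paper's exact formulation), the sketch below evaluates the three terms for a candidate binarized weight vector in a binary-class setting; the trade-off coefficients `alpha`, `beta`, `gamma`, the linear scoring rule, and the toy data are all hypothetical:

```python
# Hedged sketch of a margin-based weighted objective for a binarized linear
# classifier. alpha/beta/gamma and the scoring rule are illustrative only;
# the paper optimizes a related objective with a MIP solver.

def objective(weights, X, y, alpha=1.0, beta=1.0, gamma=0.1):
    """Lower is better: penalize margins of incorrectly classified
    instances, reward margins of correctly classified instances, and
    penalize model size (nonzero binarized weights) as a simple
    stand-in for regularization."""
    correct_margin = 0.0
    incorrect_margin = 0.0
    for x, label in zip(X, y):
        score = sum(w * xi for w, xi in zip(weights, x))
        margin = label * score           # positive iff correctly classified
        if margin >= 0:
            correct_margin += margin
        else:
            incorrect_margin += -margin  # magnitude of the violation
    model_size = sum(1 for w in weights if w != 0)
    # Trade off the three terms named in the abstract.
    return alpha * incorrect_margin - beta * correct_margin + gamma * model_size

# Toy check with binarized weights in {-1, 0, +1} (hypothetical data):
X = [(1, 0), (0, 1), (-1, 0)]
y = [1, 1, -1]
print(objective((1, 1), X, y))
```

A MIP formulation would instead declare the weights as integer decision variables and let the solver minimize this objective exactly, rather than evaluating it for a fixed candidate as done here.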