Binarized Neural Networks (BNNs) are receiving increasing attention due to their lightweight architecture and ability to run on low-power devices. The state-of-the-art for training classification BNNs restricted to few-shot learning is based on a Mixed Integer Programming (MIP) approach. This paper proposes the BeMi ensemble, a structured architecture of BNNs based on training a single BNN for each possible pair of classes and applying a majority voting scheme to predict the final output. The training of a single BNN discriminating between two classes is achieved by a MIP model that optimizes a lexicographic multi-objective function according to robustness and simplicity principles. This approach results in training networks whose output is not affected by small perturbations on the input and whose number of active weights is as small as possible, while good accuracy is preserved. We computationally validate our model on the MNIST and Fashion-MNIST datasets with up to 40 training images per class. Our structured ensemble outperforms both BNNs trained by stochastic gradient descent and state-of-the-art MIP-based approaches. While the previous approaches achieve an average accuracy of 51.1% on the MNIST dataset, the BeMi ensemble achieves an average accuracy of 61.7% when trained with 10 images per class and 76.4% when trained with 40 images per class.
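The one-versus-one majority voting scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `predict_bemi` and the representation of each pairwise BNN as a callable returning one of its two classes are assumptions for the sake of the example.

```python
from itertools import combinations
from collections import Counter

def predict_bemi(x, pairwise_nets, classes):
    """Majority-voting prediction over one binary classifier per class pair.

    pairwise_nets maps each class pair (i, j) with i < j to a classifier;
    pairwise_nets[(i, j)](x) returns either i or j for input x.
    """
    votes = Counter()
    for i, j in combinations(sorted(classes), 2):
        # each pairwise network casts one vote for the class it predicts
        votes[pairwise_nets[(i, j)](x)] += 1
    # the class collecting the most pairwise votes is the final prediction
    return votes.most_common(1)[0][0]
```

For K classes this requires training K(K-1)/2 small networks, but each one solves an easier two-class discrimination problem, which is what makes the MIP training of each individual BNN tractable.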