In this paper, it is demonstrated through a case study that multilayer feedforward neural networks with ReLU activation functions can in principle be trained iteratively with Mixed Integer Linear Programs (MILPs), as follows. Weights are determined by batch learning, with multiple iterations per batch of training data. In each iteration, the algorithm starts at the output layer and propagates information back to the first hidden layer, adjusting the weights by solving MILPs or Linear Programs. For each layer, the goal is to minimize the difference between its output and the corresponding target output. The target output of the last (output) layer is equal to the ground truth; the target output of a preceding layer is defined as the adjusted input of the following layer. For a given layer, the weights are computed by solving a MILP. Then, except for the first hidden layer, the input values are also modified with a MILP so that the layer outputs better match their target outputs. The method was tested and compared with Tensorflow/Keras (Adam optimizer) using two simple networks on the MNIST dataset of handwritten digits. The achieved accuracies were of the same magnitude as those obtained with Tensorflow/Keras.
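To make the loop structure concrete, the following is a minimal Python sketch of the backward, layer-wise training pass described above. It is an illustration under stated assumptions, not the paper's implementation: the helpers solve_weight_milp and solve_input_milp are hypothetical placeholders (filled here with least-squares fits) standing in for the actual MILP/LP formulations, and the layer shapes, ReLU-everywhere forward pass, and iteration count are likewise assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def solve_weight_milp(layer_input, target_output, W):
    # Placeholder for the per-layer MILP that adjusts the weights so that
    # relu(layer_input @ W) approaches target_output. (Assumption: a simple
    # least-squares fit stands in for the exact MILP formulation; the real
    # MILP would also account for the current weights W.)
    W_new, *_ = np.linalg.lstsq(layer_input, target_output, rcond=None)
    return W_new

def solve_input_milp(layer_input, target_output, W):
    # Placeholder for the MILP that adjusts the layer's *input* values; the
    # adjusted input becomes the target output of the preceding layer.
    adjusted_T, *_ = np.linalg.lstsq(W.T, target_output.T, rcond=None)
    return adjusted_T.T

def train_batch(weights, batch_x, batch_y, iterations=3):
    """One batch of the iterative, layer-wise training pass sketched above."""
    for _ in range(iterations):
        # Forward pass to record the input of every layer.
        inputs = [batch_x]
        for W in weights:
            inputs.append(relu(inputs[-1] @ W))
        # Backward pass: the target of the last layer is the ground truth.
        target = batch_y
        for l in range(len(weights) - 1, -1, -1):
            weights[l] = solve_weight_milp(inputs[l], target, weights[l])
            if l > 0:  # the first hidden layer's input (the data) stays fixed
                target = solve_input_milp(inputs[l], target, weights[l])
    return weights

if __name__ == "__main__":
    # Toy usage with assumed MNIST-like shapes: 784 inputs, 32 hidden units, 10 classes.
    rng = np.random.default_rng(0)
    weights = [0.01 * rng.normal(size=(784, 32)), 0.01 * rng.normal(size=(32, 10))]
    x = rng.normal(size=(64, 784))
    y = np.eye(10)[rng.integers(0, 10, size=64)]
    weights = train_batch(weights, x, y)
```

The sketch only mirrors the control flow (forward pass, then backward layer-by-layer updates of weights and targets); replacing the least-squares placeholders with the MILP/LP formulations from the paper is what distinguishes the proposed method.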