We seek to impose linear equality constraints in feedforward neural networks. Because top-layer predictors are usually nonlinear, this is a difficult task if we wish to deploy standard convex optimization methods and strong duality. To overcome this, we introduce a new saddle-point Lagrangian with auxiliary predictor variables on which the constraints are imposed. Eliminating the auxiliary variables leads to a dual minimization problem over the Lagrange multipliers introduced to enforce the linear constraints. This minimization problem is combined with the standard learning problem over the weight matrices. From this theoretical line of development, we obtain the surprising interpretation of the Lagrange parameters as additional, penultimate-layer hidden units with fixed weights stemming from the constraints. Consequently, standard minimization approaches can be used despite the inclusion of Lagrange parameters, a very satisfying, albeit unexpected, discovery. We envisage future applications ranging from multi-label classification to constrained autoencoders.
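A minimal formal sketch of the construction described above, under stated assumptions: the network $f_W$, auxiliary predictor $z$, constraint pair $(A, b)$, loss $\ell$, and coupling weight $\mu$ are illustrative notation of ours, not the paper's.

\[
\mathcal{L}(W, z, \lambda)
  \;=\; \ell(z, y)
  \;+\; \frac{\mu}{2}\,\lVert z - f_W(x) \rVert^{2}
  \;+\; \lambda^{\top} (A z - b),
\qquad
\min_{W}\,\min_{z}\,\max_{\lambda}\; \mathcal{L}(W, z, \lambda).
\]

Since $\mathcal{L}$ is convex in $z$ and affine in $\lambda$, eliminating $z$ at its stationary point (available in closed form when $\ell$ is quadratic) yields a concave dual function $g_W(\lambda) = \min_{z} \mathcal{L}(W, z, \lambda)$, leaving the joint minimization $\min_{W,\,\lambda}\, \bigl[-g_W(\lambda)\bigr]$; in this sketch, the bilinear term $\lambda^{\top} A z$ is what allows $\lambda$ to be read as extra hidden units connected through the fixed weights $A^{\top}$ determined by the constraints.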