The un-rectifying technique expresses a non-linear point-wise activation function as a data-dependent variable, which means that the activation variable, along with its input and output, can all be employed in optimization. The ReLU networks in this study are un-rectified, meaning that their activation functions are replaced with data-dependent activation variables in the form of equations and constraints. The discrete nature of the activation variables associated with un-rectifying ReLUs allows deep learning problems to be reformulated as combinatorial optimization problems. We demonstrate, however, that the optimal solution to such a combinatorial optimization problem is preserved when the discrete domains of the activation variables are relaxed to closed intervals, which makes it easier to learn a network using methods developed for real-domain constrained optimization. We also demonstrate that, by introducing data-dependent slack variables as constraints, a network can be optimized with the augmented Lagrangian approach, so that our method can in theory achieve global convergence, with all limit points being critical points of the learning problem. In experiments, our novel approach to the compressed sensing recovery problem achieved state-of-the-art performance on the MNIST database and on natural images.
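To make the un-rectifying idea concrete, here is a minimal sketch in our own notation (the symbols D_x and d_i are ours, not necessarily the paper's). For an input x in R^n, the point-wise ReLU can be written as

    ReLU(x) = D_x x,   D_x = diag(d_1, ..., d_n),   d_i = 1 if x_i > 0 and d_i = 0 otherwise,

so each activation variable d_i is data-dependent and lives in the discrete domain {0, 1}. The relaxation described above replaces {0, 1} with the closed interval [0, 1], turning the combinatorial problem into a real-domain constrained optimization problem.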
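The augmented Lagrangian step can likewise be sketched generically. The following Python fragment is an illustrative sketch of a standard first-order augmented Lagrangian loop for min_w f(w) subject to c(w) = 0; all names and the toy problem are our own, and the paper's actual scheme with data-dependent slack variables is more involved.

    # Illustrative sketch of a first-order augmented Lagrangian method for
    # min_w f(w) subject to c(w) = 0; all names here are our own.
    import numpy as np

    def augmented_lagrangian(c, grad_f, jac_c, w, rho=10.0, outer=50, inner=200, lr=1e-2):
        lam = np.zeros_like(c(w))                  # multiplier estimates
        for _ in range(outer):
            for _ in range(inner):                 # approximately minimize
                cw = c(w)                          # f(w) + lam.c(w) + (rho/2)|c(w)|^2
                g = grad_f(w) + jac_c(w).T @ (lam + rho * cw)
                w = w - lr * g                     # gradient step on the augmented Lagrangian
            lam = lam + rho * c(w)                 # first-order multiplier update
        return w, lam

    # Toy check: min |w|^2 s.t. w_0 + w_1 = 1 has solution w = (0.5, 0.5).
    grad_f = lambda w: 2.0 * w
    c      = lambda w: np.array([w[0] + w[1] - 1.0])
    jac_c  = lambda w: np.array([[1.0, 1.0]])
    w, lam = augmented_lagrangian(c, grad_f, jac_c, np.zeros(2))
    print(w)  # approximately [0.5, 0.5]

The multiplier update lam + rho * c(w) is what drives the constraint violation to zero over the outer iterations, which is the mechanism behind the convergence guarantees mentioned above.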