Recent advances in neural-network architecture allow for seamless integration of convex optimization problems as differentiable layers in an end-to-end trainable neural network. Integrating medium- and large-scale quadratic programs into a deep neural network architecture, however, is challenging, as solving quadratic programs exactly by interior-point methods has worst-case cubic complexity in the number of variables. In this paper, we present an alternative network layer architecture based on the alternating direction method of multipliers (ADMM) that is capable of scaling to problems with a moderately large number of variables. Backward differentiation is performed by implicit differentiation of the residual map of a modified fixed-point iteration. Simulation results demonstrate the computational advantage of the ADMM layer, which for medium-scale problems is approximately an order of magnitude faster than the OptNet quadratic programming layer. Furthermore, our novel backward-pass routine is efficient, from both a memory and a computation standpoint, in comparison to the standard approach based on unrolled differentiation or implicit differentiation of the KKT optimality conditions. We conclude with examples from portfolio optimization in the integrated prediction and optimization paradigm.
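As a brief illustration of the implicit-differentiation idea referenced above (a sketch in generic notation, not the paper's exact derivation), suppose the ADMM updates are collected into a fixed-point map $z^{k+1} = T(z^k; \theta)$, where $\theta$ denotes the QP data entering the layer and $z^\star$ a fixed point. The residual $F(z^\star, \theta) = z^\star - T(z^\star; \theta)$ vanishes at $z^\star$, so the implicit function theorem gives
\[
\frac{\partial z^\star}{\partial \theta}
= \Bigl(I - \frac{\partial T}{\partial z}(z^\star;\theta)\Bigr)^{-1}
  \frac{\partial T}{\partial \theta}(z^\star;\theta).
\]
Consequently, the gradient of a downstream loss $\ell(z^\star)$ with respect to $\theta$ can be obtained by solving a single linear system involving $\bigl(I - \partial_z T(z^\star;\theta)\bigr)^{\top}$, rather than storing and backpropagating through the unrolled ADMM iterations.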