LU-Net: Invertible Neural Networks Based on Matrix Factorization

LU-Net is a simple and fast architecture for invertible neural networks (INNs) based on the factorization of square weight matrices $\mathsf{A=LU}$, where $\mathsf{L}$ is a lower triangular matrix with ones on the diagonal and $\mathsf{U}$ is an upper triangular matrix. Instead of learning a dense matrix $\mathsf{A}$, we learn $\mathsf{L}$ and $\mathsf{U}$ separately. Combined with an invertible activation function, such layers can easily be inverted whenever the diagonal entries of $\mathsf{U}$ are nonzero. The determinant of the Jacobian matrix of such a layer is also cheap to compute. Consequently, the LU architecture allows for cheap computation of the likelihood via the change of variables formula and can be trained according to the maximum likelihood principle. In our numerical experiments, we test the LU-Net architecture as a generative model on several academic datasets. We also provide a detailed comparison with conventional invertible neural networks in terms of performance, training, and run time.
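
To make the layer structure concrete, the following is a minimal NumPy sketch of a single LU layer, not the authors' implementation. It assumes a layer of the form $y = \phi(\mathsf{LU}x + b)$ with a leaky ReLU as the invertible activation $\phi$ and a bias $b$; the abstract fixes neither choice, and all function names here are hypothetical. Because $\det \mathsf{L} = 1$, the log-determinant of the Jacobian reduces to $\sum_i \log|u_{ii}| + \sum_i \log \phi'(z_i)$, and the inverse only needs two triangular solves. The last function illustrates the change of variables formula, $\log p_X(x) = \log p_Z(f(x)) + \log|\det J_f(x)|$, under an assumed standard normal base density.

```python
import numpy as np

ALPHA = 0.1  # assumed negative slope of the leaky ReLU (illustrative choice)

def leaky_relu(z, alpha=ALPHA):
    return np.where(z >= 0, z, alpha * z)

def leaky_relu_inv(y, alpha=ALPHA):
    return np.where(y >= 0, y, y / alpha)

def solve_lower_unit(L, b):
    """Forward substitution for L w = b, where L has ones on the diagonal."""
    w = np.zeros_like(b, dtype=float)
    for i in range(len(b)):
        w[i] = b[i] - L[i, :i] @ w[:i]
    return w

def solve_upper(U, b):
    """Back substitution for U x = b; requires nonzero diagonal entries of U."""
    x = np.zeros_like(b, dtype=float)
    for i in reversed(range(len(b))):
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

def lu_layer_forward(x, L, U, bias, alpha=ALPHA):
    """Return y = phi(L U x + bias) and log|det J| of the layer at x."""
    z = L @ (U @ x) + bias
    y = leaky_relu(z, alpha)
    # det L = 1, so only diag(U) and the activation slopes contribute.
    log_det = np.sum(np.log(np.abs(np.diag(U))))
    log_det += np.sum(np.where(z >= 0, 0.0, np.log(alpha)))
    return y, log_det

def lu_layer_inverse(y, L, U, bias, alpha=ALPHA):
    """Invert the layer: undo the activation, then solve L w = z - bias and U x = w."""
    z = leaky_relu_inv(y, alpha)
    w = solve_lower_unit(L, z - bias)
    return solve_upper(U, w)

def log_likelihood(x, L, U, bias, alpha=ALPHA):
    """Change of variables with an assumed standard normal base density."""
    y, log_det = lu_layer_forward(x, L, U, bias, alpha)
    d = len(x)
    log_base = -0.5 * (y @ y) - 0.5 * d * np.log(2.0 * np.pi)
    return log_base + log_det

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 3
    L = np.tril(rng.normal(size=(d, d)), k=-1) + np.eye(d)  # unit lower triangular
    U = np.triu(rng.normal(size=(d, d)))
    np.fill_diagonal(U, [1.5, -0.7, 2.0])  # keep the diagonal away from zero
    bias = rng.normal(size=d)
    x = rng.normal(size=d)

    y, log_det = lu_layer_forward(x, L, U, bias)
    x_rec = lu_layer_inverse(y, L, U, bias)
    print("reconstruction error:", np.max(np.abs(x - x_rec)))
    print("log|det J|:", log_det)
    print("log-likelihood:", log_likelihood(x, L, U, bias))
```

In this sketch the log-determinant costs $O(d)$ once the pre-activation is available, and the inverse costs $O(d^2)$ through the two substitutions, which is the kind of saving the factorization is meant to provide over a dense matrix. A trainable version would additionally need to keep the diagonal of $\mathsf{U}$ away from zero during optimization; how that is handled is not stated in the abstract.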
