Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms. However, the derivatives of the eigenvectors tend to be numerically unstable, whether using the SVD to compute them analytically or using the Power Iteration (PI) method to approximate them. This instability arises in the presence of eigenvalues that are close to each other. This makes integrating eigendecomposition into deep networks difficult and often results in poor convergence, particularly when dealing with large matrices. While this can be mitigated by partitioning the data into small arbitrary groups, doing so has no theoretical basis and makes it impossible to exploit the full power of eigendecomposition. In previous work, we mitigated this by using SVD during the forward pass and PI to compute the gradients during the backward pass. However, the iterative deflation procedure required to compute multiple eigenvectors with PI tends to accumulate errors and yield inaccurate gradients. Here, we show that the Taylor expansion of the SVD gradient is theoretically equivalent to the gradient obtained using PI, yet does not rely on an iterative process in practice and thus yields more accurate gradients. We demonstrate the benefits of this increased accuracy for image classification and style transfer.
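To make the strategy concrete, the following is a minimal PyTorch-style sketch, not the authors' released implementation; the class and variable names (e.g., TaylorGradEig) and the truncation order are illustrative, and the input is assumed to be symmetric positive-definite so that the series converges. The forward pass uses a standard eigendecomposition, while the backward pass replaces the unstable 1/(λi − λj) factor of the analytical gradient with a truncated Taylor (geometric) series, 1/(λi − λj) = (1/λi) Σ_n (λj/λi)^n for λi > λj, which stays bounded when eigenvalues are close.

```python
import torch


class TaylorGradEig(torch.autograd.Function):
    """Eigendecomposition of an SPD matrix with a Taylor-approximated backward pass.

    Illustrative sketch: names and the default truncation order are assumptions,
    not taken from the paper.
    """

    @staticmethod
    def forward(ctx, M, order=9):
        lam, V = torch.linalg.eigh(M)          # eigenvalues in ascending order
        ctx.save_for_backward(lam, V)
        ctx.order = order
        return lam, V

    @staticmethod
    def backward(ctx, grad_lam, grad_V):
        lam, V = ctx.saved_tensors
        eps = 1e-12

        # ratio[i, j] = lam_j / lam_i; the geometric series converges where lam_i > lam_j (i > j).
        ratio = lam.unsqueeze(-2) / lam.unsqueeze(-1).clamp_min(eps)

        # Truncated geometric series, then divide by lam_i: approximates 1 / (lam_i - lam_j).
        series = torch.zeros_like(ratio)
        r_pow = torch.ones_like(ratio)
        for _ in range(ctx.order + 1):
            series = series + r_pow
            r_pow = r_pow * ratio
        K = series / lam.unsqueeze(-1).clamp_min(eps)

        # Keep the valid region (i > j) and build F_ij ~= 1 / (lam_j - lam_i), zero diagonal.
        K = torch.tril(K, diagonal=-1)
        F = K.transpose(-1, -2) - K

        # Analytical gradient of symmetric eigendecomposition, with F in place of the
        # exact (and unstable) eigenvalue-gap term; symmetrized since M is symmetric.
        inner = F * (V.transpose(-1, -2) @ grad_V) + torch.diag_embed(grad_lam)
        grad_M = V @ inner @ V.transpose(-1, -2)
        grad_M = 0.5 * (grad_M + grad_M.transpose(-1, -2))
        return grad_M, None


# Usage sketch: a covariance-like SPD matrix; gradients flow back through both
# the eigenvalues and the eigenvectors without an iterative backward process.
X = torch.randn(32, 8, requires_grad=True)
M = X.t() @ X / X.shape[0] + 1e-3 * torch.eye(8)
lam, V = TaylorGradEig.apply(M, 9)
(lam.sum() + V.pow(2).sum()).backward()
```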